Cube Objects

Cube

class cr.cube.cube.Cube(response: str | Dict, cube_idx: int | None = None, transforms: Dict | None = None, population: int | None = None, mask_size: int = 0)[source]

Provides access to individual slices on a cube-result.

It also provides some attributes of the overall cube-result.

cube_idx must be None (or omitted) for a single-cube CubeSet. This indicates the CubeSet contains only a single cube and influences behaviors like CA-as-0th.

augment_response(summary_cube_resp) → Cube[source]

Inflate counts, data and elements dict in case of a single filter cube.

This method is called through the CubeSet while it iterates over all the cube responses. If a cube is a single column filter and its idx > 0, it is augmented, but only if the summary cube's shape differs from its own.

In a multitable example we can have a text variable on the rows and a single column filter (text) on the columns. In the zz9 result, the two cubes (summary cube and filter cube) will have different shapes: the filter cube is missing the row labels whose counts are 0.

   | FCUBE | CAT |
-- | ----- | --- |
A  |   1   |  1  |
B  |   1   |  1  |
C  |   1   |  1  |
D  |       |  1  |
E  |       |  1  |
F  |       |  1  |

The FCUBE is missing D, E and F because its result doesn't count them, so rendering starts from the top without the correct row-label association. For a correct result, the FCUBE cube_response needs to be augmented with all the elements of the summary cube, with each existing value placed at the position of its own label.

   | FCUBE | CAT |
-- | ----- | --- |
A  |   0   |  1  |
B  |   0   |  1  |
C  |   1   |  1  |
D  |   1   |  1  |
E  |   1   |  1  |
F  |   0   |  1  |
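The alignment step described above can be sketched in plain Python. This is only an illustration of the idea, not the library's implementation: `augment_counts` and its arguments are hypothetical names, and the real method works on the full cube-response dict rather than on simple lists.

```python
def augment_counts(summary_labels, filter_labels, filter_counts):
    """Return filter-cube counts aligned to the summary cube's row labels.

    Hypothetical helper: labels missing from the filter cube are given a 0 count.
    """
    # Map each label that is present in the filter cube to its count.
    counts_by_label = dict(zip(filter_labels, filter_counts))
    # Walk the summary labels in order; absent labels fall back to 0.
    return [counts_by_label.get(label, 0) for label in summary_labels]


summary_labels = ["A", "B", "C", "D", "E", "F"]
filter_labels = ["C", "D", "E"]  # only rows with non-zero counts arrive
filter_counts = [1, 1, 1]

augmented = augment_counts(summary_labels, filter_labels, filter_counts)
print(augmented)  # [0, 0, 1, 1, 1, 0]
```

The result matches the FCUBE column of the second table: the three counts land on C, D and E instead of being rendered against A, B and C.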

available_measures[source]: frozenset of available CUBE_MEASURE members in the cube response.

counts_with_missings[source]

ndarray of weighted, unweighted or valid counts including missing values.

The difference from .counts is that this property includes values for missing categories.

covariance[source]: Optional float64 ndarray of the cube_covariance if the measure exists.

cube_index[source]: Offset of this cube within its CubeSet.

description[source]: Return the description of the cube.

dimension_types[source]: Tuple of DIMENSION_TYPE members, one for each dimension of the cube.

dimensions[source]

List of visible dimensions.

A cube involving a multiple-response (MR) variable has two dimensions for that variable (subvariables and categories dimensions), but these are “collapsed” into a single effective dimension for cube-user purposes (the categories dimension is suppressed). This collection will contain a single dimension for each MR variable and therefore may have fewer dimensions than appear in the cube response.
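As a rough illustration of this collapsing, the sketch below filters a tuple of dimension-type tags. The string tags and the function name are assumptions made for the example; the library itself works with DIMENSION_TYPE members and dimension objects.

```python
# Hypothetical string tags standing in for DIMENSION_TYPE members.
MR_CATEGORIES = "MR_CAT"


def visible_dimension_types(raw_dimension_types):
    """Drop the categories dimension of each MR variable, keeping order."""
    return tuple(t for t in raw_dimension_types if t != MR_CATEGORIES)


# A CAT x MR cube arrives with three dimensions in the response.
raw = ("CAT", "MR_SUBVAR", "MR_CAT")
visible = visible_dimension_types(raw)
print(visible)  # ('CAT', 'MR_SUBVAR')
```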

has_weighted_counts[source]: True if cube response has weighted count data.

inflate() → Cube[source]

Return new Cube object with rows-dimension added.

A multi-cube (tabbook) response formed from a function (e.g. mean()) on a numeric variable arrives without a rows-dimension.

is_single_filter_col_cube[source]: bool, True if this is a single column filter cube.

means[source]: Optional float64 ndarray of the cube_means if the measure exists.

medians[source]: Optional float64 ndarray of the cube_medians if the measure exists.

missing[source]: Get missing count of a cube.

n_responses[source]: Total (int) number of responses considered.

name[source]

Return the name of the cube.

If the cube has 2 dimensions, return the name of the second one. In case of a different number of dimensions, default to returning the name of the last one. In case of no dimensions, return the empty string.
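The naming rule above can be written out as a small function. This is a sketch under the assumption that the dimension names are available as a plain sequence of strings; `cube_name` is a hypothetical helper, not part of the library.

```python
def cube_name(dimension_names):
    """Pick a cube name from its dimension names, following the rule above."""
    if not dimension_names:
        # No dimensions: fall back to the empty string.
        return ""
    if len(dimension_names) == 2:
        # Two dimensions: use the name of the second one.
        return dimension_names[1]
    # Any other number of dimensions: use the name of the last one.
    return dimension_names[-1]


print(cube_name(["Gender", "Age"]))  # Age
print(repr(cube_name([])))           # ''
```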

ndim[source]: int count of dimensions for this cube.

overlaps[source]

Optional float64 ndarray of cube_overlaps if the measure exists.

The array has as many dimensions as there are defined in the cube query, plus the extra subvariables dimension as the last dimension.

partitions[source]: Sequence of _Slice, _Strand, or _Nub objects from this cube-result.

population_fraction[source]

The filtered/unfiltered ratio for cube response.

This value is required for properly calculating population on a cube where a filter has been applied. Returns 1.0 for an unfiltered cube. Returns np.nan if the unfiltered count is zero, which would otherwise result in a divide-by-zero error.
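The behavior described above can be sketched as follows. The two count arguments are hypothetical inputs chosen for illustration (the property derives its values from the cube response), and `math.nan` stands in for `np.nan` to keep the example free of third-party imports.

```python
import math


def population_fraction(filtered_count, unfiltered_count):
    """Return the filtered/unfiltered ratio, or NaN when it is undefined."""
    if unfiltered_count == 0:
        # Avoid a divide-by-zero error; the ratio is undefined here.
        return math.nan
    return filtered_count / unfiltered_count


print(population_fraction(50, 200))   # 0.25
print(population_fraction(200, 200))  # 1.0, as for an unfiltered cube
```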

stddev[source]: Optional float64 ndarray of the cube_stddev if the measure exists.

sums[source]: Optional float64 ndarray of the cube_sum if the measure exists.

title[source]

str alternate-name given to cube-result.

This value is suitable for naming a Strand when displayed as a column. In this use-case it is a stand-in for the columns-dimension name since a strand has no columns dimension.

unweighted_counts[source]

ndarray of unweighted counts, valid elements only.

Unweighted counts are drawn from the result.counts field of the cube result. These counts are always present, even when the measure is numeric and there are no count measures. These counts are always unweighted, regardless of whether the cube is “weighted”.

When valid counts are present in the cube response, the counts are replaced with the valid counts measure.

unweighted_valid_counts[source]: Optional float64 ndarray of unweighted_valid_counts if the measure exists.

valid_counts_summary_range[source]: Optional (min, max) tuple of summary valid counts

valid_overlaps[source]

Optional float64 ndarray of cube_valid_overlaps if the measure exists.

The array has as many dimensions as there are defined in the cube query, plus the extra subvariables dimension as the last dimension.

weighted_counts[source]

ndarray of weighted counts, valid elements only.

When valid counts are present in the cube response, the weighted counts are replaced with the valid counts measure.

weighted_squared_counts[source]: Optional float64 ndarray of weighted_squared_counts if the measure exists.

weighted_valid_counts[source]: Optional float64 ndarray of weighted_valid_counts if the measure exists.

CubeSet

class cr.cube.cube.CubeSet(cube_responses: List[Dict], transforms: Dict, population: int, min_base: int)[source]

Represents a multi-cube cube-response.

Also works just fine for a single cube-response passed inside a sequence, allowing uniform handling of single and multi-cube responses.

cube_responses is a sequence of cube-response dicts received from Crunch. The sequence can contain a single item, such as a cube-response for a slide, but it must be contained in a sequence. A tabbook cube-response sequence can be passed as it was received.

transforms is a sequence of transforms dicts corresponding in order to the cube-responses. population is the estimated target population and is used when a population-projection measure is requested. min_base is an integer representing the minimum sample-size used for indicating values that are unreliable by reason of insufficient sample (base).
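To make the role of min_base concrete, here is a minimal sketch of flagging values whose base is too small. The helper name is hypothetical, and treating a base strictly below min_base as insufficient is an assumption of this example, not something the text above specifies.

```python
def flag_insufficient_bases(bases, min_base):
    """Return one bool per base, True where the sample is too small.

    Assumption: a base strictly below min_base counts as insufficient.
    """
    return [base < min_base for base in bases]


flags = flag_insufficient_bases([120, 45, 100], min_base=100)
print(flags)  # [False, True, False]
```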

available_measures[source]: frozenset of available measures of the first cube in this set.

can_show_pairwise[source]: True if all 2D cubes in a multi-cube set can provide pairwise comparison.

description[source]: str description of first cube in this set.

has_numeric_measures[source]

True if cube response contains numeric measures like mean, sum, stddev.

Returns True if any numeric cube measure is in the cube response, False otherwise.
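The membership check described above might look like this sketch. The measure names are plain strings invented for the example (the library uses CUBE_MEASURE members), and the set of numeric measures shown is limited to the three named in the text.

```python
# Hypothetical string names standing in for CUBE_MEASURE members.
NUMERIC_MEASURES = frozenset({"mean", "sum", "stddev"})


def has_numeric_measures(available_measures):
    """True if at least one numeric measure is among the available ones."""
    return not NUMERIC_MEASURES.isdisjoint(available_measures)


print(has_numeric_measures(frozenset({"count", "mean"})))  # True
print(has_numeric_measures(frozenset({"count"})))          # False
```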

has_weighted_counts[source]: True if cube-responses include a weighted-count measure.

is_ca_as_0th[source]

True for multi-cube when first cube represents a categorical-array.

A “CA-as-0th” tabbook tab is “3D” in the sense it is “sliced” into one table (partition-set) for each of the CA subvariables.

missing_count[source]: The number of missing values from first cube in this set.

n_responses[source]: Total number of responses considered from first cube in this set.

name[source]: str name of first cube in this set.

partition_sets[source]

Sequence of cube-partition collections across all cubes of this cube-set.

For a CA-as-0th tabbook, this value might look like the following:

(
    (_Strand, _Slice, _Slice),
    (_Strand, _Slice, _Slice),
    (_Strand, _Slice, _Slice),
)

and might often look like this for a typical slide:

((_Slice,),)

Each partition set represents the partitions for a single “stacked” table. A 2D slide has a single partition-set of a single _Slice object, as in the second example above. A 3D slide would have multiple partition sets, each of a single _Slice. A tabbook will have multiple partitions in each set, the first being a _Strand and the rest being _Slice objects. Multiple partition sets only arise for a tabbook in the CA-as-0th case.
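The nesting can be explored with a small sketch that uses strings as stand-ins for the real _Strand and _Slice objects. The `describe` helper is hypothetical and only counts the structure; it does not touch any library API.

```python
# Strings stand in for real _Strand / _Slice partition objects.
ca_as_0th_tabbook = (
    ("_Strand", "_Slice", "_Slice"),
    ("_Strand", "_Slice", "_Slice"),
    ("_Strand", "_Slice", "_Slice"),
)
typical_slide = (("_Slice",),)  # note the trailing comma: a tuple of one tuple


def describe(partition_sets):
    """Return (number of stacked tables, partitions in each table)."""
    return len(partition_sets), [len(ps) for ps in partition_sets]


print(describe(ca_as_0th_tabbook))  # (3, [3, 3, 3])
print(describe(typical_slide))      # (1, [1])
```

Each outer element corresponds to one stacked table, so the tabbook example yields three tables of three partitions each, while the slide yields one table holding a single slice.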

population_fraction[source]

The filtered/unfiltered ratio for this cube-set.

This value is required for properly calculating population on a cube where a filter has been applied. Returns 1.0 for an unfiltered cube. Returns np.nan if the unfiltered count is zero, which would otherwise result in a divide-by-zero error.

valid_counts_summary_range[source]: The valid count summary values from first cube in this set.