API Reference
dazed.confusion_matrix
Confusion matrix module.
-
class dazed.confusion_matrix.ConfusionMatrix(y1, y2, labels=None, info=None)
Construct a confusion matrix.
Builds a confusion matrix from several input formats (sparse labels, onehot encoded arrays, or a pandas dataframe) and provides methods for exploring the data.
-
__init__(y1, y2, labels=None, info=None)
Construct a confusion matrix from sparse values.
In most cases it is recommended that you use the from_sparse, from_onehot, or from_df class methods instead, as they offer additional support for multilabel data.
- Parameters
y1 (List[Union[str,int]]) – A list of true labels.
y2 (List[Union[str,int]]) – A list of predicted labels.
labels (Optional[List[Union[str,int]]]) – A list of all possible labels (in case not present in y1 and y2).
info (Optional[List[Any]]) – A list containing any additional info about each sample.
Example
>>> truth = ["cat", "dog", "cat", "dog", "fish"]
>>> pred = ["cat", "dog", "dog", "cat", "fish"]
>>> ConfusionMatrix(truth, pred)
  | 0 1 2    index | label
---------    -------------
0 | 1 1 0        0 | cat
1 | 1 1 0        1 | dog
2 | 0 0 1        2 | fish
---------    -------------
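To make the counts in the example above concrete, here is a minimal pure-Python sketch of how such a matrix can be tallied from two label lists. It is not dazed's implementation: count_confusions is a hypothetical helper, and both the row/column orientation (rows as true labels, columns as predictions) and the label ordering (first appearance) are assumptions made for illustration.

```python
from typing import Hashable, List, Optional, Sequence, Tuple


def count_confusions(
    truth: Sequence[Hashable],
    pred: Sequence[Hashable],
    labels: Optional[Sequence[Hashable]] = None,
) -> Tuple[List[List[int]], List[Hashable]]:
    """Count (true, predicted) pairs into a square matrix.

    Hypothetical helper for illustration only; not part of dazed.
    """
    if len(truth) != len(pred):
        raise ValueError("truth and pred must have the same length")
    # Assumption: labels are ordered by first appearance unless given.
    if labels is None:
        labels = list(dict.fromkeys([*truth, *pred]))
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(truth, pred):
        # Assumption: rows are true labels, columns are predicted labels.
        matrix[index[t]][index[p]] += 1
    return matrix, list(labels)


truth = ["cat", "dog", "cat", "dog", "fish"]
pred = ["cat", "dog", "dog", "cat", "fish"]
matrix, labels = count_confusions(truth, pred)
print(matrix)
print(labels)
```

Passing an explicit labels list lets the matrix include labels that never occur in either input, which mirrors the purpose of the labels parameter described above.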
-
as_array(present_only=True)
Get confusion matrix as an array.
- Parameters
present_only (bool) – Whether to return a matrix that only includes labels present in y1 and/or y2.
- Return type
Tuple[ndarray,List[Union[str,int]]]
- Returns
A tuple containing the confusion matrix as a numpy array and a list of the confusion matrix's labels.
Example
>>> truth = ["cat", "dog", "cat", "dog", "fish"]
>>> pred = ["cat", "dog", "dog", "cat", "fish"]
>>> ConfusionMatrix.from_sparse(truth, pred).as_array()
(array([[1, 1, 0],
        [1, 1, 0],
        [0, 0, 1]]), ['cat', 'dog', 'fish'])
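The (matrix, labels) pair is a convenient starting point for summary metrics. The sketch below copies the values from the example output as plain nested lists and derives overall accuracy and per-label recall. summarize is a hypothetical helper, not a dazed method, and it assumes rows hold true labels and columns hold predictions.

```python
from typing import Dict, Hashable, List, Tuple

# Values copied from the as_array() example output, written as plain lists.
matrix: List[List[int]] = [[1, 1, 0], [1, 1, 0], [0, 0, 1]]
labels: List[Hashable] = ["cat", "dog", "fish"]


def summarize(
    matrix: List[List[int]], labels: List[Hashable]
) -> Tuple[float, Dict[Hashable, float]]:
    """Return overall accuracy and per-label recall.

    Hypothetical helper; assumes rows are true labels, columns predictions.
    """
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(labels)))
    recall = {}
    for i, label in enumerate(labels):
        row_total = sum(matrix[i])
        # Labels with no true samples get 0.0 rather than dividing by zero.
        recall[label] = matrix[i][i] / row_total if row_total else 0.0
    accuracy = correct / total if total else 0.0
    return accuracy, recall


accuracy, recall = summarize(matrix, labels)
print(accuracy)
print(recall)
```

The same indexing works on the numpy array returned by as_array(), so the helper could be applied to its output directly under the same orientation assumption.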
-
as_df(present_only=True)
Get confusion matrix as a pandas dataframe.
- Parameters
present_only (bool) – Whether to return a matrix that only includes labels present in y1 and/or y2.
- Return type
DataFrame
- Returns
A confusion matrix as a pandas dataframe.
Example
>>> truth = ["cat", "dog", "cat", "dog", "fish"]
>>> pred = ["cat", "dog", "dog", "cat", "fish"]
>>> ConfusionMatrix.from_sparse(truth, pred).as_df()
      cat  dog  fish
cat     1    1     0
dog     1    1     0
fish    0    0     1
-
as_str(present_only=True)
Get confusion matrix as a string.
- Parameters
present_only (bool) – Whether to return a matrix that only includes labels present in y1 and/or y2.
- Return type
str
- Returns
A confusion matrix string.
Example
>>> truth = ["cat", "dog", "cat", "dog", "fish"]
>>> pred = ["cat", "dog", "dog", "cat", "fish"]
>>> print(ConfusionMatrix.from_sparse(truth, pred).as_str())
  | 0 1 2    index | label
---------    -------------
0 | 1 1 0        0 | cat
1 | 1 1 0        1 | dog
2 | 0 0 1        2 | fish
---------    -------------
-
classmethod from_df(cls, df, y1_names, y2_names, labels=None, info_names=None)
Construct a confusion matrix from a pandas dataframe.
- Parameters
df (DataFrame) – A pandas dataframe containing either a column of sparse labels or multiple columns of onehot encoded values.
y1_names (Union[List[str],str]) – True column name or a list of true column names if multilabel.
y2_names (Union[List[str],str]) – Prediction column name or a list of prediction column names if multilabel.
labels (Optional[List[Union[str,int]]]) – A list of all possible labels (in case not present in y1 and y2).
info_names (Optional[List[str]]) – A list of column names to use for additional sample info.
- Returns
A confusion matrix.
- Raises
ValueError – If y1_names or y2_names is not of the correct type.
Example
>>> sparse_df = pd.DataFrame()
>>> sparse_df["truth"] = ["cat", "dog", "cat", "dog", "fish"]
>>> sparse_df["pred"] = ["cat", "dog", "dog", "cat", "fish"]
>>> ConfusionMatrix.from_df(sparse_df, "truth", "pred")
  | 0 1 2    index | label
---------    -------------
0 | 1 1 0        0 | cat
1 | 1 1 0        1 | dog
2 | 0 0 1        2 | fish
---------    -------------
>>> onehot_df = pd.DataFrame()
>>> onehot_df["cat_truth"] = [0, 1, 0, 1]
>>> onehot_df["dog_truth"] = [1, 0, 1, 0]
>>> onehot_df["cat_pred"] = [0, 1, 1, 0]
>>> onehot_df["dog_pred"] = [1, 0, 0, 1]
>>> ConfusionMatrix.from_df(
...     onehot_df,
...     ["cat_truth", "dog_truth"],
...     ["cat_pred", "dog_pred"],
...     ["cat", "dog"],
... )
  | 0 1    index | label
-------    -------------
0 | 1 1        0 | cat
1 | 1 1        1 | dog
-------    -------------
-
classmethod from_onehot(y1, y2, labels=None, info=None, multilabel=False)
Construct a confusion matrix from onehot encoded values.
- Parameters
y1 (ndarray) – An array of onehot encoded true values of shape [num_samples, num_labels].
y2 (ndarray) – An array of onehot encoded predicted values of shape [num_samples, num_labels].
labels (Optional[List[Union[str,int]]]) – A list of label names, in the same order as the columns of y1 and y2.
info (Optional[List[Any]]) – A list containing any additional info about each sample.
multilabel (bool) – Indicates whether each sample can have multiple labels.
- Returns
A confusion matrix.
Example
>>> truth = np.array([[0, 1], [1, 0], [0, 1], [1, 0]])
>>> pred = np.array([[0, 1], [1, 0], [1, 0], [0, 1]])
>>> ConfusionMatrix.from_onehot(truth, pred, ["cat", "dog"])
  | 0 1    index | label
-------    -------------
0 | 1 1        0 | cat
1 | 1 1        1 | dog
-------    -------------
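For the single-label case, onehot rows and sparse labels carry the same information. The sketch below shows one way to map each onehot row to its label name using the column order given by labels. onehot_to_sparse is a hypothetical helper written with plain lists; it is not how dazed converts its inputs, and it deliberately rejects rows that are not strictly onehot instead of handling the multilabel case.

```python
from typing import Hashable, List, Sequence


def onehot_to_sparse(
    rows: Sequence[Sequence[int]], labels: Sequence[Hashable]
) -> List[Hashable]:
    """Map each onehot row to the label of its single active column.

    Hypothetical helper for illustration only; not part of dazed.
    """
    sparse = []
    for row in rows:
        if len(row) != len(labels):
            raise ValueError("each row needs one column per label")
        # Single-label assumption: exactly one column is set to 1.
        if sorted(row) != [0] * (len(row) - 1) + [1]:
            raise ValueError("row is not onehot encoded")
        sparse.append(labels[list(row).index(1)])
    return sparse


labels = ["cat", "dog"]
truth = onehot_to_sparse([[0, 1], [1, 0], [0, 1], [1, 0]], labels)
pred = onehot_to_sparse([[0, 1], [1, 0], [1, 0], [0, 1]], labels)
print(truth)
print(pred)
```

Pairing the two resulting lists gives one sample for each of the four (true, predicted) combinations, which agrees with the all-ones 2x2 matrix in the example above.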
-
classmethod from_sparse(y1, y2, labels=None, info=None, multilabel=False)
Construct a confusion matrix from sparse values.
- Parameters
y1 (Union[List[Union[str,int]],List[List[Union[str,int]]]]) – A list of true labels (a list of lists if multilabel).
y2 (Union[List[Union[str,int]],List[List[Union[str,int]]]]) – A list of predicted labels (a list of lists if multilabel).
labels (Optional[List[Union[str,int]]]) – A list of all possible labels (in case not present in y1 and y2).
info (Optional[List[Any]]) – A list containing any additional info about each sample.
multilabel (bool) – Indicates whether each sample can have multiple labels.
- Returns
A confusion matrix.
Example
>>> truth = ["cat", "dog", "cat", "dog", "fish"]
>>> pred = ["cat", "dog", "dog", "cat", "fish"]
>>> ConfusionMatrix.from_sparse(truth, pred)
  | 0 1 2    index | label
---------    -------------
0 | 1 1 0        0 | cat
1 | 1 1 0        1 | dog
2 | 0 0 1        2 | fish
---------    -------------
-
label_pair_info(label_1, label_2)
Get sample information by label pair.
- Parameters
label_1 (Union[int,str]) – A true label.
label_2 (Union[int,str]) – A predicted label.
- Return type
List[Any]
- Returns
A list of info for samples that had a true label of label_1 and a predicted label of label_2.
- Raises
ValueError – If either label is not present.
Example
>>> truth = ["cat", "dog", "cat", "dog", "fish"]
>>> pred = ["cat", "dog", "dog", "cat", "fish"]
>>> filenames = ["img0.jpg", "img1.jpg", "img2.jpg", "img3.jpg", "img4.jpg"]
>>> cm = ConfusionMatrix.from_sparse(truth, pred, info=filenames)
>>> cm.label_pair_info("cat", "dog")
['img2.jpg']
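The lookup above can be pictured as grouping each sample's info under its (true label, predicted label) pair. The following sketch reproduces the example result with a plain dictionary. It is an illustration of the idea under that assumption, not dazed's internal data structure, and info_by_pair is a name introduced here.

```python
from collections import defaultdict
from typing import Any, Dict, Hashable, List, Tuple

truth = ["cat", "dog", "cat", "dog", "fish"]
pred = ["cat", "dog", "dog", "cat", "fish"]
filenames = ["img0.jpg", "img1.jpg", "img2.jpg", "img3.jpg", "img4.jpg"]

# Group per-sample info by (true label, predicted label).
info_by_pair: Dict[Tuple[Hashable, Hashable], List[Any]] = defaultdict(list)
for t, p, item in zip(truth, pred, filenames):
    info_by_pair[(t, p)].append(item)

# Use .get so that looking up an unseen pair does not create a new entry.
print(info_by_pair.get(("cat", "dog"), []))
print(info_by_pair.get(("dog", "cat"), []))
```

Unlike label_pair_info, this sketch returns an empty list for unknown labels instead of raising ValueError.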
-
most_confused()
Get a list of label confusions and counts.
- Return type
List[Tuple[Union[int,str],Union[int,str],int]]
- Returns
A list of tuples of the form (true label, predicted label, number of confusions).
Example
>>> truth = ["cat", "dog", "cat", "dog", "fish"]
>>> pred = ["cat", "cat", "dog", "cat", "fish"]
>>> cm = ConfusionMatrix.from_sparse(truth, pred)
>>> cm.most_confused()
[('dog', 'cat', 2), ('cat', 'dog', 1)]
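The result in the example can be reproduced by counting only the mismatched (true, predicted) pairs and ranking them by frequency. The sketch below does this with collections.Counter. It is a minimal illustration rather than dazed's implementation, and the descending-by-count ordering is an assumption inferred from the example output.

```python
from collections import Counter
from typing import Hashable, List, Sequence, Tuple


def rank_confusions(
    truth: Sequence[Hashable], pred: Sequence[Hashable]
) -> List[Tuple[Hashable, Hashable, int]]:
    """Rank mismatched (true, predicted) pairs by how often they occur.

    Hypothetical helper for illustration only; not part of dazed.
    """
    # Correct predictions (t == p) are not confusions, so skip them.
    counts = Counter((t, p) for t, p in zip(truth, pred) if t != p)
    # most_common() sorts by count, highest first.
    return [(t, p, n) for (t, p), n in counts.most_common()]


truth = ["cat", "dog", "cat", "dog", "fish"]
pred = ["cat", "cat", "dog", "cat", "fish"]
ranked = rank_confusions(truth, pred)
print(ranked)
```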
-