pine.quantitative_evaluation.text_classification package

Submodules

pine.quantitative_evaluation.text_classification.data module

class pine.quantitative_evaluation.text_classification.data.Dataset(name: str, path: pathlib.Path, split_idx: int)

Bases: object

load()

class pine.quantitative_evaluation.text_classification.data.Document(words: List[str], target: int)

Bases: object

pine.quantitative_evaluation.text_classification.data.load_kusner_datasets(path: pathlib.Path) → List[pine.quantitative_evaluation.text_classification.data.Dataset]
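The signatures above suggest that a Document pairs a list of tokens with an integer class label. The following is a minimal sketch, assuming that reading, of how such documents could be separated into the List[List[str]] corpus shape that ParallelCachingWmdSimilarity accepts and a parallel list of labels. DocumentSketch and split_corpus are hypothetical stand-ins for illustration and are not part of pine.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class DocumentSketch:
    """Hypothetical stand-in mirroring Document(words, target)."""
    words: List[str]  # tokenized text
    target: int       # class label (assumed to be an integer class index)


def split_corpus(documents: List[DocumentSketch]) -> Tuple[List[List[str]], List[int]]:
    """Separate token lists from labels, keeping both in the same order."""
    corpus = [document.words for document in documents]
    targets = [document.target for document in documents]
    return corpus, targets


# Toy data, made up for this example.
documents = [
    DocumentSketch(words=["cheap", "flights", "to", "rome"], target=0),
    DocumentSketch(words=["election", "results", "tonight"], target=1),
]
corpus, targets = split_corpus(documents)
```

Keeping the two lists index-aligned means a row of similarities computed against corpus can be mapped back to labels by position.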

pine.quantitative_evaluation.text_classification.evaluation module

class pine.quantitative_evaluation.text_classification.evaluation.Evaluator(dataset: pine.quantitative_evaluation.text_classification.data.Dataset, model: pine.language_model.LanguageModel, method: str)

Bases: object

evaluate() → float

class pine.quantitative_evaluation.text_classification.evaluation.ParallelCachingWmdSimilarity(corpus: List[List[str]], vectors: gensim.models.keyedvectors.KeyedVectors, cache_path: pathlib.Path, num_best: Optional[int] = None, chunksize: int = 256)

Bases: gensim.interfaces.SimilarityABC

get_similarities(queries: List[List[str]]) → numpy.ndarray

Get similarities of the given queries against this index.

Parameters

queries (list of list of str) – Query documents, each given as a list of tokens.
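The constructor's cache_path and chunksize parameters suggest that pairwise distances are cached and that queries are processed in chunks. The page does not document how, so the following is only a sketch under stated assumptions: CachingSimilaritySketch, chunked, and toy_distance are hypothetical names, the cache is a pickled dict, distances are mapped to similarities as 1 / (1 + distance), and parallelism is left out to keep the example deterministic.

```python
import pickle
import tempfile
from pathlib import Path
from typing import Callable, Dict, Iterator, List, Tuple

Tokens = List[str]
Key = Tuple[Tuple[str, ...], Tuple[str, ...]]


def chunked(items: List[Tokens], chunksize: int) -> Iterator[List[Tokens]]:
    """Yield consecutive slices of at most `chunksize` items."""
    for start in range(0, len(items), chunksize):
        yield items[start:start + chunksize]


class CachingSimilaritySketch:
    """Hypothetical sketch: disk-backed cache around a distance function."""

    def __init__(self, corpus: List[Tokens], distance: Callable[[Tokens, Tokens], float],
                 cache_path: Path, chunksize: int = 256) -> None:
        self.corpus = corpus
        self.distance = distance
        self.cache_path = cache_path
        self.chunksize = chunksize
        self.cache: Dict[Key, float] = {}
        if cache_path.exists():  # reuse distances from an earlier run
            with cache_path.open("rb") as f:
                self.cache = pickle.load(f)

    def _cached_distance(self, query: Tokens, document: Tokens) -> float:
        key = (tuple(query), tuple(document))
        if key not in self.cache:
            self.cache[key] = self.distance(query, document)
        return self.cache[key]

    def get_similarities(self, queries: List[Tokens]) -> List[List[float]]:
        rows = []
        for chunk in chunked(queries, self.chunksize):
            for query in chunk:
                # Assumed mapping from distance to similarity.
                rows.append([1.0 / (1.0 + self._cached_distance(query, document))
                             for document in self.corpus])
            with self.cache_path.open("wb") as f:  # persist after every chunk
                pickle.dump(self.cache, f)
        return rows


calls = []  # records every uncached distance computation


def toy_distance(query: Tokens, document: Tokens) -> float:
    """Made-up distance: number of words the two documents do not share."""
    calls.append((tuple(query), tuple(document)))
    return float(len(set(query) ^ set(document)))


with tempfile.TemporaryDirectory() as tmp:
    index = CachingSimilaritySketch([["a", "b"], ["c"]], toy_distance,
                                    Path(tmp) / "cache.pkl", chunksize=1)
    first = index.get_similarities([["a", "b"], ["a"]])
    second = index.get_similarities([["a", "b"]])  # served from the cache
```

The second call triggers no new distance computations, which is the point of the cache when the same query and document pairs recur across runs.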

pine.quantitative_evaluation.text_classification.evaluation.wmdistance(query: List[str], document: List[str]) → float
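The name wmdistance presumably refers to the Word Mover's Distance between two tokenized documents. This page does not show the implementation, so the sketch below is not the function documented here. It computes the relaxed Word Mover's Distance, a cheaper lower bound on the exact distance, using only the standard library. The vectors argument and the toy embeddings are assumptions made for illustration; the documented function takes only query and document.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence

Vectors = Dict[str, Sequence[float]]


def _nbow(words: List[str]) -> Dict[str, float]:
    """Normalized bag-of-words weights that sum to 1."""
    counts = Counter(words)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def _one_sided(source: List[str], target: List[str], vectors: Vectors) -> float:
    """Move each source word's weight entirely to its nearest target word."""
    targets = set(target)
    return sum(
        weight * min(math.dist(vectors[word], vectors[other]) for other in targets)
        for word, weight in _nbow(source).items()
    )


def relaxed_wmdistance(query: List[str], document: List[str], vectors: Vectors) -> float:
    """Relaxed WMD: the larger of the two one-sided transport costs."""
    # Drop out-of-vocabulary words; returning inf for empty input is a choice
    # made in this sketch.
    query = [word for word in query if word in vectors]
    document = [word for word in document if word in vectors]
    if not query or not document:
        return float("inf")
    return max(_one_sided(query, document, vectors),
               _one_sided(document, query, vectors))


# Toy two-dimensional embeddings, made up for this example.
vectors = {"king": (0.0, 0.0), "queen": (0.0, 1.0), "apple": (3.0, 4.0)}

same = relaxed_wmdistance(["king"], ["king"], vectors)            # 0.0
near = relaxed_wmdistance(["king"], ["queen"], vectors)           # 1.0
far = relaxed_wmdistance(["king"], ["apple"], vectors)            # 5.0
mixed = relaxed_wmdistance(["king", "queen"], ["king"], vectors)  # 0.5
```

The exact distance additionally requires solving an optimal transport problem in which each document word receives exactly its own weight. The relaxed variant ignores that constraint, which is why it can only underestimate the exact value.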

pine.quantitative_evaluation.text_classification.text_classification module

class pine.quantitative_evaluation.text_classification.text_classification.Result(result: List[float])

Bases: object

pine.quantitative_evaluation.text_classification.text_classification.evaluate(dataset_path: pathlib.Path, language_model: pine.language_model.LanguageModel, method: str) → pine.quantitative_evaluation.text_classification.text_classification.Result

pine.quantitative_evaluation.text_classification.text_classification.get_dataset_paths(language: str, dataset_dir: pathlib.Path) → List[pathlib.Path]

Module contents

class pine.quantitative_evaluation.text_classification.Result(result: List[float])

Bases: object

pine.quantitative_evaluation.text_classification.evaluate(dataset_path: pathlib.Path, language_model: pine.language_model.LanguageModel, method: str) → pine.quantitative_evaluation.text_classification.text_classification.Result

pine.quantitative_evaluation.text_classification.get_dataset_paths(language: str, dataset_dir: pathlib.Path) → List[pathlib.Path]