Programa%20final/spacy/pipeline/__pycache__/textcat.cpython-312.pyc

Ë
>û g<ãó—ddlmZddlmZmZmZmZmZmZm	Z	ddl
Z
ddlmZm
Z
mZmZmZddlmZddlmZddlmZdd	lmZdd
lmZddlmZmZmZddlm Z dd
l!m"Z"ddl#m$Z$dZ%e«jMe%«dZ'dZ(dZ)ejTddgde'ddidœdddddddddddœ
¬«dede+de
eeeefde,d eed!d"fd#„«Z-d$eed!ee+effd%„Z.e j^d«d&„«Z0Gd'„d"e$«Z1y)(é)Úislice)ÚAnyÚCallableÚDictÚIterableÚListÚOptionalÚTupleN)ÚConfigÚModelÚ	OptimizerÚget_array_moduleÚset_dropout_rate)ÚFloats2dé)ÚErrors)ÚLanguage)ÚScorer)ÚDoc)ÚExampleÚvalidate_examplesÚvalidate_get_examples)Úregistry)ÚVocabé)Ú
TrainablePipeaW
[model]
@architectures = "spacy.TextCatEnsemble.v2"

[model.tok2vec]
@architectures = "spacy.Tok2Vec.v2"

[model.tok2vec.embed]
@architectures = "spacy.MultiHashEmbed.v2"
width = 64
rows = [2000, 2000, 500, 1000, 500]
attrs = ["NORM", "LOWER", "PREFIX", "SUFFIX", "SHAPE"]
include_static_vectors = false

[model.tok2vec.encode]
@architectures = "spacy.MaxoutWindowEncoder.v2"
width = ${model.tok2vec.embed.width}
window_size = 1
maxout_pieces = 3
depth = 2

[model.linear_model]
@architectures = "spacy.TextCatBOW.v3"
exclusive_classes = true
length = 262144
ngram_size = 1
no_output_layer = false
Úmodelz€
[model]
@architectures = "spacy.TextCatBOW.v3"
exclusive_classes = true
length = 262144
ngram_size = 1
no_output_layer = false
a`
[model]
@architectures = "spacy.TextCatReduce.v1"
exclusive_classes = true
use_reduce_first = false
use_reduce_last = false
use_reduce_max = false
use_reduce_mean = true

[model.tok2vec]
@architectures = "spacy.HashEmbedCNN.v2"
pretrained_vectors = null
width = 96
depth = 4
embed_size = 2000
window_size = 1
maxout_pieces = 3
subword_features = true
Útextcatzdoc.catsçz@scorerszspacy.textcat_scorer.v2)Ú	thresholdrÚscorerçð?)
Ú
cats_scoreÚcats_score_descÚcats_micro_pÚcats_micro_rÚcats_micro_fÚcats_macro_pÚcats_macro_rÚcats_macro_fÚcats_macro_aucÚcats_f_per_type)ÚassignsÚdefault_configÚdefault_score_weightsÚnlpÚnamer r!ÚreturnÚTextCategorizercó6—t|j||||¬«S)aÏCreate a TextCategorizer component. The text categorizer predicts categories
    over a whole document. It can learn one or more labels, and the labels are considered
    to be mutually exclusive (i.e. one true label per doc).

    model (Model[List[Doc], List[Floats2d]]): A model instance that predicts
        scores for each category.
    threshold (float): Cutoff to consider a prediction "positive".
    scorer (Optional[Callable]): The scoring method.
    )r r!)r3Úvocab)r0r1rr r!s     úWC:\Users\garci\AppData\Roaming\Python\Python312\site-packages\spacy/pipeline/textcat.pyÚmake_textcatr7Ms€ôJ˜3Ÿ9™9 e¨T¸YÈvÔVÐVóÚexamplescó4—tj|dfddi|¤ŽS)NÚcatsÚmulti_labelF)rÚ
score_cats)r9Úkwargss  r6Ú
textcat_scorer?us/€Ü×ÑØØñðððñ	ðr8có—tS©N)r?©r8r6Úmake_textcat_scorerrC~s€äÐr8có —eZdZdZ	d$edœdedededede	e
dd	fd
„Zed„«Z
edeefd„«Zedeefd
„«Zdeefd„Zdeedd	fd„Zdd	d	dœdeedede	ede	eeefdeeeff
d„Zdd	d	dœdeedede	ede	eeefdeeeff
d„Zdeedeej8ej8ffd„Zdeedeeeffd„Zdedefd„Z d	d	d	dœde
geefde	e!d e	eed!e	edd	f
d"„Z"deefd#„Z#y	)%r3zmPipeline component for single-label text classification.

    DOCS: https://spacy.io/api/textcategorizer
    )r!r5rr1r r!r2Ncóv—||_||_||_d|_g|ddœ}t	|«|_||_y)aaInitialize a text categorizer for single-label classification.

        vocab (Vocab): The shared vocabulary.
        model (thinc.api.Model): The Thinc Model powering the pipeline component.
        name (str): The component instance name, used to add entries to the
            losses during training.
        threshold (float): Unused, not needed for single-label (exclusive
            classes) classification.
        scorer (Optional[Callable]): The scoring method. Defaults to
                Scorer.score_cats for the attribute "cats".

        DOCS: https://spacy.io/api/textcategorizer#init
        N)Úlabelsr Úpositive_label)r5rr1Ú_rehearsal_modelÚdictÚcfgr!)Úselfr5rr1r r!rJs       r6Ú__init__zTextCategorizer.__init__‰sE€ð,ˆŒ
ØˆŒ
ØˆŒ	Ø $ˆÔàØ"Ø"ñ
ˆô
˜“9ˆŒØˆ�r8có—y)NFrB©rKs r6Úsupport_missing_valuesz&TextCategorizer.support_missing_values«s€ð
r8có2—t|jd«S)z†RETURNS (Tuple[str]): The labels currently added to the component.

        DOCS: https://spacy.io/api/textcategorizer#labels
        rF)ÚtuplerJrNs r6rFzTextCategorizer.labels²s€ô�T—X‘X˜hÑ'Ó(Ð(r8có—|jS)z†RETURNS (List[str]): Information about the component's labels.

        DOCS: https://spacy.io/api/textcategorizer#label_data
        )rFrNs r6Ú
label_datazTextCategorizer.label_dataºs€ð�{‰{Ðr8Údocscóš—td„|D««ss|D�cgc]}|j‘Œ}}|jjj}|jt
t|««t
|j«f«}|S|jj|«}|jjj|«}|Scc}w)zþApply the pipeline's model to a batch of docs, without modifying them.

        docs (Iterable[Doc]): The documents to predict.
        RETURNS: The model's prediction for each document.

        DOCS: https://spacy.io/api/textcategorizer#predict
        c3ó2K—|]}t|«–—ŒywrA©Úlen©Ú.0Údocs  r6ú	<genexpr>z*TextCategorizer.predict.<locals>.<genexpr>Êóèø€Ð,¡t ”3�s—8¡tùó‚)ÚanyÚtensorrÚopsÚxpÚzerosrXÚlistrFÚpredictÚasarray)rKrTr[ÚtensorsrbÚscoress      r6rezTextCategorizer.predictÂsœ€ôÑ,¡tÓ,Ô,á-1Ó2©T c�s—z“z¨TˆGÐ2Ø—‘—‘×"Ñ"ˆBØ—X‘Xœs¤4¨£:›´°D·K±KÓ0@ÐAÓBˆFØˆMØ—‘×#Ñ# DÓ)ˆØ—‘—‘×'Ñ'¨Ó/ˆØˆ
ùò
3s—Ccóž—t|«D]?\}}t|j«D]"\}}t|||f«|j|<Œ$ŒAy)aModify a batch of Doc objects, using pre-computed scores.

        docs (Iterable[Doc]): The documents to modify.
        scores: The scores to set, produced by TextCategorizer.predict.

        DOCS: https://spacy.io/api/textcategorizer#set_annotations
        N)Ú	enumeraterFÚfloatr;)rKrTrhÚir[ÚjÚlabels       r6Úset_annotationszTextCategorizer.set_annotationsÔsG€ô  –o‰FˆAˆsÜ% d§k¡kÖ2‘��5Ü"'¨¨q°!¨t©Ó"5�—‘˜’ñ3ñ&r8r)ÚdropÚsgdÚlossesr9rprqrrcóØ—|€i}|j|jd«t|d«|j|«t	d„|D««s|St|j|«|jj|D�cgc]}|j‘Œc}«\}}|j||«\}}	||	«|�|j|«||jxx|z
cc<|Scc}w)a1Learn from a batch of documents and gold-standard information,
        updating the pipe's model. Delegates to predict and get_loss.

        examples (Iterable[Example]): A batch of Example objects.
        drop (float): The dropout rate.
        sgd (thinc.api.Optimizer): The optimizer.
        losses (Dict[str, float]): Optional record of the loss during training.
            Updated using the component name as the key.
        RETURNS (Dict[str, float]): The updated losses dictionary.

        DOCS: https://spacy.io/api/textcategorizer#update
        rzTextCategorizer.updatec3óbK—|]'}|jrt|j«nd–—Œ)yw)rN)Ú	predictedrX)rZÚegs  r6r\z)TextCategorizer.update.<locals>.<genexpr>ùs%èø€ÐOÁhÀ¨¯ª”3�r—|‘|Ô$¸!Ó;Áhùs‚-/)Ú
setdefaultr1rÚ_validate_categoriesr_rrÚbegin_updateruÚget_lossÚ
finish_update)
rKr9rprqrrrvrhÚ	bp_scoresÚlossÚd_scoress
          r6ÚupdatezTextCategorizer.updateàsÖ€ð(ˆ>ØˆFØ×Ñ˜$Ÿ)™) SÔ)Ü˜(Ð$<Ô=Ø×!Ñ! (Ô+ÜÑOÁhÓOÔOàˆMÜ˜Ÿ™ TÔ*Ø ŸJ™J×3Ñ3ÉHÓ4UÉHÀb°R·\³\ÈHÑ4UÓVÑˆ�	ØŸ™ x°Ó8‰ˆˆhÙ�(ÔØˆ?Ø×Ñ˜sÔ#Øˆt�y‰yÓ˜TÑ!ÓØˆ
ùò
5VsÂC'có4—|€i}|j|jd«|j€|St|d«|j	|«|D�cgc]}|j
‘Œ}}t
d„|D««s|St|j|«|jj|«\}}|jj|«\}	}
||	z
}||«|�|j|«||jxx|dzj«z
cc<|Scc}w)a«Perform a "rehearsal" update from a batch of data. Rehearsal updates
        teach the current model to make predictions similar to an initial model,
        to try to address the "catastrophic forgetting" problem. This feature is
        experimental.

        examples (Iterable[Example]): A batch of Example objects.
        drop (float): The dropout rate.
        sgd (thinc.api.Optimizer): The optimizer.
        losses (Dict[str, float]): Optional record of the loss during training.
            Updated using the component name as the key.
        RETURNS (Dict[str, float]): The updated losses dictionary.

        DOCS: https://spacy.io/api/textcategorizer#rehearse
        rzTextCategorizer.rehearsec3ó2K—|]}t|«–—ŒywrArWrYs  r6r\z+TextCategorizer.rehearse.<locals>.<genexpr>#r]r^r)rwr1rHrrxrur_rrryr{Úsum)rKr9rprqrrrvrTrhr|ÚtargetÚ_Úgradients            r6ÚrehearsezTextCategorizer.rehearses€ð,ˆ>ØˆFØ×Ñ˜$Ÿ)™) SÔ)Ø× Ñ Ð(ØˆMÜ˜(Ð$>Ô?Ø×!Ñ! (Ô+Ù'/Ó0¡x �—“ xˆÐ0ÜÑ,¡tÓ,Ô,àˆMÜ˜Ÿ™ TÔ*Ø ŸJ™J×3Ñ3°DÓ9Ñˆ�	Ø×)Ñ)×6Ñ6°tÓ<‰	ˆ�Ø˜F‘?ˆÙ�(ÔØˆ?Ø×Ñ˜sÔ#Øˆt�y‰yÓ˜h¨™k×.Ñ.Ó0Ñ0ÓØˆ
ùò1sÁDcó"—tt|««}tj|t|j«fd¬«}tj
|t|j«fd¬«}t
|«D]m\}}t
|j«D]P\}}||jjvr|jj||||f<Œ=|jsŒJd|||f<ŒRŒo|jjj|«}||fS)NÚf)Údtyper)
rXrdÚnumpyrcrFÚonesrjÚ	referencer;rOrrarf)	rKr9Únr_examplesÚtruthsÚnot_missingrlrvrmrns	         r6Ú_examples_to_truthz"TextCategorizer._examples_to_truth0sâ€ôœ$˜x›.Ó)ˆÜ—‘˜k¬3¨t¯{©{Ó+;Ð<ÀCÔHˆÜ—j‘j +¬s°4·;±;Ó/?Ð!@ÈÔLˆÜ˜xÖ(‰EˆAˆrÜ% d§k¡kÖ2‘��5Ø˜BŸL™L×-Ñ-Ñ-Ø#%§<¡<×#4Ñ#4°UÑ#;�F˜1˜a˜4’LØ×0Ó0Ø(+�K  1 Ò%ñ	3ð)ð—‘—‘×'Ñ'¨Ó/ˆØ�{Ð"Ð"r8có—t|d«|j|«|j|«\}}|jjj|«}||z
}||z}|dzj
«}t|«|fS)aeFind the loss and gradient of loss for the batch of documents and
        their predicted scores.

        examples (Iterable[Example]): The batch of examples.
        scores: Scores representing the model's predictions.
        RETURNS (Tuple[float, float]): The loss and the gradient.

        DOCS: https://spacy.io/api/textcategorizer#get_loss
        zTextCategorizer.get_lossr)rrxr�rrarfÚmeanrk)rKr9rhrŽr�r~Úmean_square_errors       r6rzzTextCategorizer.get_loss?s�€ô	˜(Ð$>Ô?Ø×!Ñ! (Ô+Ø"×5Ñ5°hÓ?Ñˆ�Ø—j‘j—n‘n×,Ñ,¨[Ó9ˆØ˜F‘?ˆØ�KÑˆØ% q™[×.Ñ.Ó0ÐÜÐ&Ó'¨Ð1Ð1r8rncóæ—t|t«sttj«‚||j
vry|j
«|jdj|«|jrZd|jjvrB|jjd|jt|j
««|_	|jjj|«y)zÎAdd a new label to the pipe.

        label (str): The label to add.
        RETURNS (int): 0 if label is already present, otherwise 1.

        DOCS: https://spacy.io/api/textcategorizer#add_label
        rrFÚ
resize_outputr)Ú
isinstanceÚstrÚ
ValueErrorrÚE187rFÚ_allow_extra_labelrJÚappendrÚattrsrXr5ÚstringsÚadd)rKrns  r6Ú	add_labelzTextCategorizer.add_labelRs®€ô˜%¤Ô%ÜœVŸ[™[Ó)Ð)Ø�D—K‘KÑØØ×ÑÔ!Ø�‰�Ñ×!Ñ! %Ô(Ø�:Š:˜/¨T¯Z©Z×-=Ñ-=Ñ=Ø:˜Ÿ™×)Ñ)¨/Ñ:¸4¿:¹:ÄsÈ4Ï;É;ÓGWÓXˆDŒJØ�
‰
×Ñ×Ñ˜uÔ%Ør8)r0rFrGÚget_examplesr0rFrGcó|—t|d«|j|««|€9|«D].}|jjD]}|j	|«ŒŒ0n|D]}|j	|«Œt|j«dkrttj«‚|�’||jvr6tjj||j¬«}t|«‚t|j«dk7r6tjj||j¬«}t|«‚||jd<tt|«d««}	|	D�
cgc]}
|
j ‘Œ}}
|j#|	«\}}
|j%«t|«dkDs/Jtj&j|j(¬««‚t|«dkDs/Jtj&j|j(¬««‚|j*j-||¬	«ycc}
w)
aInitialize the pipe for training, using a representative set
        of data examples.

        get_examples (Callable[[], Iterable[Example]]): Function that
            returns a representative sample of gold-standard Example objects.
        nlp (Language): The current nlp object the component is part of.
        labels (Optional[Iterable[str]]): The labels to add to the component, typically generated by the
            `init labels` command. If no labels are provided, the get_examples
            callback is used to extract the labels from the data.
        positive_label (Optional[str]): The positive label for a binary task with exclusive classes,
            `None` otherwise and by default.

        DOCS: https://spacy.io/api/textcategorizer#initialize