INTUIA/Programa final/spacy/pipeline/__pycache__/tok2vec.cpython-312.pyc
2026-03-15 13:27:50 +00:00
Ë
>û g5ã óbddlmZddlmZmZmZmZmZmZm Z ddl
m Z m Z m
Z
mZddlmZddlmZddlmZddlmZmZmZdd lmZd
d lmZd Ze «j=e«d
Zej@ddgd
ei¬«dede!d
e ddfd«Z"Gdde«Z#Gdde «Z$d
e$de%fdZ&dZ'y)é)Úislice)ÚAnyÚCallableÚDictÚIterableÚListÚOptionalÚSequence)ÚConfigÚModelÚ OptimizerÚset_dropout_rateé)ÚErrors)ÚLanguage)ÚDoc)ÚExampleÚvalidate_examplesÚvalidate_get_examples)ÚVocabé)Ú
TrainablePipez­
[model]
@architectures = "spacy.HashEmbedCNN.v2"
pretrained_vectors = null
width = 96
depth = 4
embed_size = 2000
window_size = 1
maxout_pieces = 3
subword_features = true
ÚmodelÚtok2vecz
doc.tensor)ÚassignsÚdefault_configÚnlpÚnameÚreturnÚTok2Veccó0t|j||«S©N)r Úvocab)rrrs úWC:\Users\garci\AppData\Roaming\Python\Python312\site-packages\spacy/pipeline/tok2vec.pyÚ make_tok2vecr%sô 3—99˜e  c ó2eZdZdZd dedededdfdZede dfd „«Z
ede efd
«Z d dd eddfd
Z d dd ede
fdZd!dZdeefdZdeeddfdZddddœdeededeedeeeeffdZd!dZddœdegeefdeefdZdZy)"r aAApply a "token-to-vector" model and set its outputs in the doc.tensor
attribute. This is mostly useful to share a single subnetwork between multiple
components, e.g. to have one embedding and CNN network shared between a
parser, tagger and NER.
In order to use the `Tok2Vec` predictions, subsequent components should use
the `Tok2VecListener` layer as the tok2vec subnetwork of their model. This
layer will read data from the `doc.tensor` attribute during prediction.
During training, the `Tok2Vec` component will save its prediction and backprop
callback for each batch, so that the subsequent components can backpropagate
to the shared weights. This implementation is used because it allows us to
avoid relying on object identity within the models to achieve the parameter
sharing.
r#rrrNcóJ||_||_||_i|_i|_y)a“Initialize a tok2vec component.
vocab (Vocab): The shared vocabulary.
model (thinc.api.Model[List[Doc], List[Floats2d]]):
The Thinc Model powering the pipeline component. It should take
a list of Doc objects as input, and output a list of 2d float arrays.
name (str): The component instance name.
DOCS: https://spacy.io/api/tok2vec#init
N)r#rrÚ listener_mapÚcfg)Úselfr#rrs r$Ú__init__zTok2Vec.__init__2s(ðˆŒ
؈Œ
؈Œ Ø@BˆÔØ#%ˆr&ÚTok2VecListenercón|jDcgc]}|j|D]}|ŒŒc}}Scc}}w)zuRETURNS (List[Tok2VecListener]): The listener models listening to this
component. Usually internals.
)Úlistening_componentsr))r+Úms r$Ú listenerszTok2Vec.listenersCs9ð
 ×4a¸d×>OÑ>OÐPQÔ>R¸Ð>RÐSùÓSs1cóHt|jj««S)zoRETURNS (List[str]): The downstream components listening to this
component. Usually internals.
)Úlistr)Úkeys)r+s r$r/zTok2Vec.listening_componentsJsô
%×-r&ÚlistenerÚcomponent_namecóœ|jj|g«||j|vr|j|j|«yy)z=Add a listener for a downstream component. Usually internals.N)r)Ú
setdefaultÚappend©r+r6r7s r$Ú add_listenerzTok2Vec.add_listenerQsIà ×Ñ×$ ^°RÔ ˜4×,¨^Ñ × Ñ ˜ -× 4°XÕ  =r&có¸||jvrL||j|vr;|j|j|«|j|s
|j|=yy)z@Remove a listener for a downstream component. Usually internals.TF)r)Úremover;s r$Úremove_listenerzTok2Vec.remove_listenerWs^à ˜T× ˜4×,¨^Ñ×! 1×Ô×Ò×)¨.ÐØr&cód|jf}tt|dd«t«r\|jj «D]>}t|t «sŒ|j|vsŒ#|j||j«Œ@yy)Walk over a model of a processing component, looking for layers that
are Tok2vecListener subclasses that have an upstream_name that matches
this component. Listeners can also set their upstream_name attribute to
the wildcard string '*' to match any `Tok2Vec`.
You're unlikely to ever need multiple `Tok2Vec` components, so it's
fine to leave your listeners upstream_name on '*'.
Ú*rN) rÚ
isinstanceÚgetattrr rÚwalkr-Ú
upstream_namer<)r+Ú componentÚnamesÚnodes r$Úfind_listenerszTok2Vec.find_listenersbsmðd—i‘iÐ ˆÜ ”g˜i¨°$ÓÔ ×.ܘd¤OÕ×9KÑ9KÈuÒ9TØ×% d¨I¯N©NÕ @r&Údocscó
td|D««sP|jjd«}|Dcgc])}|jjj d|f«Œ+c}S|jj |«}|Scc}w)a?Apply the pipeline's model to a batch of docs, without modifying them.
Returns a single tensor for a batch of documents.
docs (Iterable[Doc]): The documents to predict.
RETURNS: Vector representations for each token in the documents.
DOCS: https://spacy.io/api/tok2vec#predict
c3ó2K|]}t|«Œy­wr")Úlen©Ú.0Údocs r$ú <genexpr>z"Tok2Vec.predict.<locals>.<genexpr>zsèø€Ð,¡t ”3s—8¡tùsÚnOr)ÚanyrÚget_dimÚopsÚallocÚpredict)r+rJÚwidthrPÚtokvecss r$rWzTok2Vec.predictqsqôÑ,¡tÓ—JJ×& ,ˆEÙ@DÓ¸D—J‘J—N‘N×(¨!¨U¨ÕÑ —*‘*×$ TÓØˆùòFs².Bcótt||«D])\}}|jdt|«k(sJ||_Œ+y)zøModify a batch of documents, using pre-computed scores.
docs (Iterable[Doc]): The documents to modify.
tokvecses: The tensors to set, produced by Tok2Vec.predict.
DOCS: https://spacy.io/api/tok2vec#set_annotations
rN)ÚzipÚshaperMÚtensor)r+rJÚ tokvecsesrPrYs r$Úset_annotationszTok2Vec.set_annotationss;ô   0‰LˆCØ—=‘= Ñ#¤s¨3£xÒ  ˆC1r&ç)ÚdropÚsgdÚlossesÚexamplesrarbrccó‚
iŠt|d«|Dcgc]}|jŒ}}tj|«jj |«\ŠŠ Dcgc]/}jj
j |jŽŒ1c}Š
jjd«ˆ
ˆˆˆfdŠ ˆ ˆ ˆ
ˆˆfd}tj|«} ‰jddD]}
|
j| ‰ «Œjr ‰jdj| ‰|«Scc}wcc}w)aLearn from a batch of documents and gold-standard information,
updating the pipe's model.
examples (Iterable[Example]): A batch of Example objects.
drop (float): The dropout rate.
sgd (thinc.api.Optimizer): The optimizer.
losses (Dict[str, float]): Optional record of the loss during training.
Updated using the component name as the key.
RETURNS (Dict[str, float]): The updated losses dictionary.
DOCS: https://spacy.io/api/tok2vec#update
NzTok2Vec.updater`có>tt|««D]F}|xx||z
cc<jxxt||dzj ««z
cc<ŒHDcgc]/}j
j j|jŽŒ1c}Scc}w)zžAccumulate tok2vec loss and gradient. This is passed as a callback
to all but the last listener. Only the last one does the backprop.
r) ÚrangerMrÚfloatÚsumrrUÚalloc2fr\)Ú
one_d_tokvecsÚt2vÚ d_tokvecsrcr+rYs €€€€r$Úaccumulate_gradientz+Tok2Vec.update.<locals>.accumulate_gradientªø€ô
œ3˜}Ó.ؘ!“  
¨aÑ 0Ñ0“ Øt—y!¤U¨M¸!Ñ,<ÀÑ,A×+FÑ+FÓ+HÓ%IÑCJÓJÁ'¸*D—JJ—NN×*¨C¯I©IÒ6À'Ñ JùÒJsÁ#4BcóN|««}j«|S)z>Callback to actually do the backprop. Passed to last listener.)Ú
finish_update)rkÚd_docsroÚ
bp_tokvecsrnr+rbs €€€€€r$Úbackpropz Tok2Vec.update.<locals>.backprop´s-ø€á  
Ô   Ó*ˆˆØ×" ˆMr&éÿÿÿÿ)rÚ predictedrrÚ begin_updaterUrjr\r9rr-Ú get_batch_idr2Úreceive)r+rdrarbrcÚegrJrmrtÚbatch_idr6rorsrnrYs` `` @@@@r$ÚupdatezTok2Vec.updatesþ€ð( ˆˆ˜(Ð$4Ô5Ù'/Ó0¡x   xˆÐ˜Ÿ TÔ"Ÿj™j×5°dÓˆÙCJÓKÁ7¸CÐ+T—ZZ—^^×+¨S¯Y©YÒ7À7ÑKˆ Ø×ј$Ÿ)™) SÔ K÷ ð ôÓØŸ s¨Ó+ˆHØ × Ñ ˜X wÐ0CÕ  >Š>Ø N‰N˜ × & ¸ ˆ
ùò;1ùòLs œD7Á)4D<cóyr"©)r+rdÚscoress r$Úget_losszTok2Vec.get_lossÃsØ r&)rÚ get_examplesrcót|d«g}t|«d«D]}|j|j«Œ|s/Jtj
j
|j¬««|jj|¬«y)atInitialize the pipe for training, using a representative set
of data examples.
get_examples (Callable[[], Iterable[Example]]): Function that
returns a representative sample of gold-standard Example objects.
nlp (Language): The current nlp object the component is part of.
DOCS: https://spacy.io/api/tok2vec#initialize
zTok2Vec.initializeé
©r)ÚXN)
rrr:ÚxrÚE923ÚformatrrÚ
initialize)r+rrÚ
doc_sampleÚexamples r$r‰zTok2Vec.initializeÆsqô ˜lÐ,@Ôˆ
Ü™l›n¨bÖ1ˆGØ × Ñ ˜gŸi™iÕ Ð=œ6Ÿ;™;×-°4·9±9Ð=ˆ
×Ñ 
ÐÕ+r&cótr")ÚNotImplementedError)r+Úlabels r$Ú add_labelzTok2Vec.add_labelÜsÜ!r&)r)rN) Ú__name__Ú
__module__Ú __qualname__Ú__doc__rr Ústrr,Úpropertyrr2r/r<Úboolr?rIrrrWr
r_rrhr r
rr|r€rrr‰rr~r&r$r r "sXñ
ñ&˜&¨Eð&¸ð&ÈTó&ð"ðT˜4Ð 1ÑTóðTð ð. d¨3¡iòð ?Ð%6ðð?ÐPTó Ð(9ð È3ð ÐSWó ó
˜H S™Móð
! H¨S¡Mð
ó
Ø#'Ø-1ò
4à˜7Ñ4ðð 4ð

ð 4𠘘c 5˜jÑ
4ól
ð#'ò ˜r 8¨GÑ#4Ð
ó ,ó,"r&códeZdZdZdZdededdfdZede e
defd „«Z d
eddfd Z de