INTUIA/Programa final/spacy/training/__pycache__/augment.cpython-312.pyc
112 lines
14 KiB
Plaintext

2026-03-15 13:27:50 +00:00

Compiled module: spacy/training/augment.py (CPython 3.12)

Imported names: itertools, random, partial, TYPE_CHECKING, Callable, Dict, Iterator, List, Optional, Tuple, registry, Example, _doc_to_biluo_tags_with_partial, split_bilu_label, Language

create_combined_augmenter(lower_level, orth_level, orth_variants, whitespace_level, whitespace_per_token, whitespace_variants), registered as "spacy.combined_augmenter.v1"

Create a data augmentation callback that uses orth-variant replacement.
The callback can be added to a corpus or other data iterator during training.
lower_level (float): The percentage of texts that will be lowercased.
orth_level (float): The percentage of texts that will be augmented.
orth_variants (Optional[Dict[str, List[Dict]]]): A dictionary containing the
single and paired orth variants. Typically loaded from a JSON file.
whitespace_level (float): The percentage of texts that will have whitespace
tokens inserted.
whitespace_per_token (float): The number of whitespace tokens to insert in
the modified doc as a percentage of the doc length.
whitespace_variants (Optional[List[str]]): The whitespace token texts.
RETURNS (Callable[[Language, Example], Iterator[Example]]): The augmenter.
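
The `whitespace_level` and `whitespace_per_token` parameters described above can be illustrated with a minimal pure-Python sketch. It operates on a plain list of token strings rather than spaCy `Example` objects, and `insert_whitespace_tokens` is a hypothetical helper written for illustration, not part of spaCy.

```python
import random
from typing import List, Optional


def insert_whitespace_tokens(
    tokens: List[str],
    whitespace_level: float,
    whitespace_per_token: float,
    whitespace_variants: List[str],
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Hypothetical helper: insert whitespace tokens into a copy of `tokens`."""
    rng = rng or random.Random()
    out = list(tokens)
    # Augment only about `whitespace_level` of all texts.
    if not whitespace_variants or rng.random() >= whitespace_level:
        return out
    # The number of insertions is a fraction of the original token count.
    n_insert = int(len(tokens) * whitespace_per_token)
    for _ in range(n_insert):
        position = rng.randrange(0, len(out))
        out.insert(position, rng.choice(whitespace_variants))
    return out


if __name__ == "__main__":
    sample = ["This", "is", "a", "short", "text", "."]
    print(insert_whitespace_tokens(sample, 1.0, 0.5, [" ", "\n"]))
```

With `whitespace_level=1.0` every text is augmented, and with `0.0` none is, because `random()` returns a value in the half-open interval [0, 1).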
create_combined_augmenter returns partial(combined_augmenter, ...) with lower_level, orth_level, orth_variants, whitespace_level, whitespace_per_token and whitespace_variants passed as keyword arguments.
Source file: C:\Users\garci\AppData\Roaming\Python\Python312\site-packages\spacy/training/augment.py

combined_augmenter(nlp, example, lower_level, orth_level, orth_variants, whitespace_level, whitespace_per_token, whitespace_variants)
Names referenced: random, make_lowercase_variant, text, to_dict, _doc_to_biluo_tags_with_partial, reference, make_orth_variants, from_dict, make_doc, range, int, len, make_whitespace_variant, choice, randrange
String constants: "doc_annotation", "entities", "token_annotation"

create_orth_variants_augmenter(level, lower, orth_variants), registered as "spacy.orth_variants.v1"

Create a data augmentation callback that uses orth-variant replacement.
The callback can be added to a corpus or other data iterator during training.
level (float): The percentage of texts that will be augmented.
lower (float): The percentage of texts that will be lowercased.
orth_variants (Dict[str, List[Dict]]): A dictionary containing
the single and paired orth variants. Typically loaded from a JSON file.
RETURNS (Callable[[Language, Example], Iterator[Example]]): The augmenter.
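
As a rough illustration of single orth-variant replacement, the sketch below swaps tokens whose tag and text both match an entry under the "single" key. The "single", "tags" and "variants" keys appear as string constants in this module, but the helper name `replace_single_variants` and the sample data are assumptions made for this example, and the function works on plain lists rather than spaCy objects.

```python
import random
from typing import Dict, List, Optional


def replace_single_variants(
    words: List[str],
    tags: List[str],
    orth_variants: Dict[str, List[Dict]],
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Hypothetical helper: apply single orth variants to a token list."""
    rng = rng or random.Random()
    out = list(words)
    for entry in orth_variants.get("single", []):
        # Choose one replacement per entry so the text stays consistent.
        replacement = rng.choice(entry["variants"])
        for i, (word, tag) in enumerate(zip(words, tags)):
            if tag in entry["tags"] and word in entry["variants"]:
                out[i] = replacement
    return out


if __name__ == "__main__":
    # Made-up variant table for illustration only.
    variants = {"single": [{"tags": ["NFP"], "variants": ["\u2026", "..."]}]}
    print(replace_single_variants(["Wait", "..."], ["UH", "NFP"], variants))
```

Picking the replacement once per entry, rather than once per token, keeps all matching tokens in a text on the same variant.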
create_orth_variants_augmenter returns partial(orth_variants_augmenter, ...) with orth_variants, level and lower passed as keyword arguments.

create_lower_casing_augmenter(level), registered as "spacy.lower_case.v1"

Create a data augmentation callback that converts documents to lowercase.
The callback can be added to a corpus or other data iterator during training.
level (float): The percentage of texts that will be augmented.
RETURNS (Callable[[Language, Example], Iterator[Example]]): The augmenter.
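
The `level` parameter acts as a per-text probability. A minimal sketch of that idea on plain strings follows; `lower_casing_sketch` is a hypothetical stand-in that skips the `Example` handling a real augmenter would need.

```python
import random
from typing import Iterator, Optional


def lower_casing_sketch(
    text: str, level: float, rng: Optional[random.Random] = None
) -> Iterator[str]:
    """Hypothetical stand-in: yield the text, lowercased with probability `level`."""
    rng = rng or random.Random()
    if rng.random() >= level:
        # Leave this text unchanged.
        yield text
    else:
        yield text.lower()


if __name__ == "__main__":
    for sample in ["Hello World", "spaCy Augmenters"]:
        print(list(lower_casing_sketch(sample, level=0.5)))
```

Writing the augmenter as a generator lets a caller yield zero, one or several variants per input, which matches the `Iterator[Example]` return type in the docstring.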
create_lower_casing_augmenter returns partial(lower_casing_augmenter, level=level).

dont_augment(nlp, example)
Generator that yields example unchanged.

lower_casing_augmenter(nlp, example, level)
Generator. Names referenced: random, make_lowercase_variant

make_lowercase_variant(nlp, example)
Names referenced: to_dict, _doc_to_biluo_tags_with_partial, reference, make_doc, text, lower, lower_, from_dict
String constants: "doc_annotation", "entities", "token_annotation", "ORTH"

orth_variants_augmenter(nlp, example, orth_variants, level, lower)
Generator. Names referenced: random, text, to_dict, _doc_to_biluo_tags_with_partial, reference, make_orth_variants, from_dict, make_doc
String constants: "doc_annotation", "entities", "token_annotation"

make_orth_variants(nlp, raw, token_dict, orth_variants, lower=False)
Local variables: words, tags, ndsv, punct_choices, word_idx, punct_idx, ndpv, pair_idx, pair
Names referenced: get, lower, random, choice, range, len, itertools, chain, from_iterable, index, construct_modified_raw_text
String constants: "ORTH", "TAG", "single", "variants", "tags", "paired"

make_whitespace_variant(nlp, example, whitespace, position)