INTUIA/Programa final/spacy/pipeline/__pycache__/lemmatizer.cpython-312.pyc

142 lines
16 KiB
Plaintext

2026-03-15 13:27:50 +00:00
Ë
>û gâ0ãó°ddlZddlmZddlmZmZmZmZmZm Z m
Z
m Z ddl m
Z
ddlmZddlmZmZddlmZdd lmZmZdd
lmZdd lmZmZdd lmZdd
lmZm Z m!Z!ddl"m#Z#ddl$m%Z%ejLddgdddddidœddi¬«dede e
de'de'de(d e ef d!„«Z)d"eed#ee'effd$„Z*e!jVd«d%„«Z,Gd&„d'e%«Z-y)(éN)ÚPath)ÚAnyÚCallableÚDictÚIterableÚListÚOptionalÚTupleÚUnion)ÚModelé)Úutil)ÚErrorsÚWarnings)ÚLanguage)ÚLookupsÚ load_lookups)ÚScorer)ÚDocÚToken)ÚExample)ÚSimpleFrozenListÚloggerÚregistry)ÚVocabé)ÚPipeÚ
lemmatizerz token.lemmaÚlookupFz@scorerszspacy.lemmatizer_scorer.v1)ÚmodelÚmodeÚ overwriteÚscorerÚ lemma_accgð?)ÚassignsÚdefault_configÚdefault_score_weightsÚnlpr Únamer!r"r#có8t|j|||||¬«S)r!r"r#)Ú
LemmatizerÚvocab)r(r r)r!r"r#s úZC:\Users\garci\AppData\Roaming\Python\Python312\site-packages\spacy/pipeline/lemmatizer.pyÚmake_lemmatizerr/s"ô& Ø 5˜$ T°YÀvô ðóÚexamplesÚreturnc ó0tj|dfi|¤ŽS)lemma)rÚscore_token_attr)r1Úkwargss r.Úlemmatizer_scorer7+sÜ × " 8¨WÑ ?¸Ñ ?r0cótS©N)r7©r0r.Úmake_lemmatizer_scorerr;/sä Ðr0cóøeZdZdZededeeeeeffd«Z d%dde dœde
d e e d
eded e
d e edd
fdZed«ZdedefdZ d&d
d
dœde egeefde ede efdZej4fdedd
fdZdedeefdZdedeefdZdede
fdZe «dœde!ee"fdeefd „Z#e «dœde!ee"fdeeddfd!„Z$e «dœdeede%fd"„Z&e «dœd#e%deeddfd$„Z'y
)'r,z
The Lemmatizer supports simple part-of-speech-sensitive suffix rules and
lookup tables.
DOCS: https://spacy.io/api/lemmatizer
r!r2có6|dk(rdggfS|dk(rdgddgfSggfS)a Returns the lookups configuration settings for a given mode for use
in Lemmatizer.load_lookups.
mode (str): The lemmatizer mode.
RETURNS (Tuple[List[str], List[str]]): The required and optional
lookup tables for this mode.
rÚ lemma_lookupÚruleÚ lemma_rulesÚ lemma_excÚ lemma_indexr:)Úclsr!s r.Úget_lookups_configzLemmatizer.get_lookups_config<s<ð  Ø$ bÐ
VŠ^Ø"O k°=Ð%AÐ Bˆxˆr0rFr+r-r r)r"r#NcóÀ||_||_||_||_t «|_||_d|_|jdk(r|j|_
nv|jdk(r|j|_
nU|jd}t||«s)ttjj!|¬««t#||«|_
i|_||_y)a&Initialize a Lemmatizer.
vocab (Vocab): The vocab.
model (Model): A model (not yet implemented).
name (str): The component name. Defaults to "lemmatizer".
mode (str): The lemmatizer mode: "lookup", "rule". Defaults to "lookup".
overwrite (bool): Whether to overwrite existing lemmas. Defaults to
`False`.
scorer (Optional[Callable]): The scoring method. Defaults to
Scorer.score_token_attr for the attribute "lemma".
DOCS: https://spacy.io/api/lemmatizer#init
Frr?Ú
_lemmatize)r!N)r-r r)Ú_moderÚlookupsr"Ú
_validatedr!Úlookup_lemmatizeÚ lemmatizeÚrule_lemmatizeÚhasattrÚ
ValueErrorrÚE1003ÚformatÚgetattrÚcacher#)Úselfr-r r)r!r"r#Ú mode_attrs r.Ú__init__zLemmatizer.__init__Kð.ˆŒ
؈Œ
؈Œ ؈Œ
Ü“yˆŒ ،؈ŒØ 9‰9˜Ò Ø2ˆD
Y‰Y˜
Ø0ˆDŸ9™9˜+ ZÐ0ˆIܘ4 Ô ¤§¡×!4Ñ!4¸$Ð!4Ó!?Ó$ T¨9Ó5ˆDŒN؈Œ
؈ r0có|jSr9)rG)rSs r.r!zLemmatizer.modeus àz‰zÐr0ÚdoccóN|js|jtj«|j «} |D]7}|j
s|j dk(sŒ|j|«d|_Œ9|S#t$r }||j||g|«Yd}~yd}~wwxYw)z´Apply the lemmatizer to one document.
doc (Doc): The Doc to process.
RETURNS (Doc): The processed Doc.
DOCS: https://spacy.io/api/lemmatizer#call
rN) rIÚ_validate_tablesrÚE1004Úget_error_handlerr"r4rKÚlemma_Ú Exceptionr))rSrWÚ
error_handlerÚtokenÚes r.Ú__call__zLemmatizer.__call__ysðŠØ × !¤&§,¡,Ô ×
ðØ—>> U§[¡[°AÓ%5Ø#'§>¡>°%Ó#8¸Ñ#;E•LððˆJøÜò ˜$Ÿ)™) T¨C¨5°!× 4ûð 5ús½ A;ÁA;Á; B$ÂBÂB$)r(rHÚ get_examplesr(rHcó¤|j|j«\}}|€Štjd«t |j
j |¬«}t |j
j |d¬«}|jD]#}|j||j|««Œ%||_
|jtj«y)Initialize the lemmatizer and load in data.
get_examples (Callable[[], Iterable[Example]]): Function that
returns a representative sample of gold-standard Example objects.
nlp (Language): The current nlp object the component is part of.
lookups (Lookups): The lookups object containing the (optional) tables
such as "lemma_rules", "lemma_index", "lemma_exc" and
"lemma_lookup". Defaults to None.
Nz2Lemmatizer: loading tables from spacy-lookups-data)ÚlangÚtablesF)rdreÚstrict)rDr!rÚdebugrr-rdreÚ set_tableÚ get_tablerHrYrrZ)rSrbr(rHÚrequired_tablesÚoptional_tablesÚoptional_lookupsÚtables r.Ú
initializezLemmatizer.initializeŒð ,0×+BÑ+BÀ4Ç9Á9Ó+MÑ(ˆ˜Ø ˆ L‰LÐ ¯
©
¯©ÀÔPˆGÜ—Z‘Z—_‘_¨_ÀUô Ð ð0Ø×! %Ð)9×)CÑ)CÀEÓ)JÕˆŒ Ø ×ÑœfŸl™lÕ+r0Ú
error_messagecóî|j|j«\}}|D]K}||jvsŒt|j |j||jj
¬««d|_y)z8Check that the lookups are correct for the current mode.)r!reÚfoundTN)rDr!rHrNrPrerI)rSrorjrkrms r.rYzLemmatizer._validate_tables¨sqà+/×+BÑ+BÀ4Ç9Á9Ó+MÑ(ˆ˜Û$ˆEؘDŸL™LÒ Ø!ŸY™YØ"Ÿl™l×óððˆr0r_có°|jjdi«}|j|j|j«}t |t
«r|g}|S)zÞLemmatize using a lookup-based approach.
token (Token): The token to lemmatize.
RETURNS (list): The available lemmas for the string.
DOCS: https://spacy.io/api/lemmatizer#lookup_lemmatize
r>)rHriÚgetÚtextÚ
isinstanceÚstr)rSr_Ú lookup_tableÚresults r.rJzLemmatizer.lookup_lemmatize¶sJð—|‘|×-¨n¸bÓ Ø×! %§*¡*¨e¯j©jÓ9ˆÜ fœcÔ Xˆˆ
r0cót|j|j|jjf}||jvr|j|S|j
}|j j«}|dvr9|dk(r#tjtj«|j«gS|j|«r|j«gS|jjdi«}|jjdi«}|jjdi«}t|j!|«|j!|«|j!|«f«s|dk(r|gS|j«gS|j!|i«}|j!|i«} |j!|i«}
|} |j«}g} g}
|
D]n\}}|j#|«sŒ|dt%|«t%|«z
|z}|sŒ8||vs|j'«s| j)|«Œ^|
j)|«Œpt+t,j/| ««} | j!|g«D]}|| vsŒ| j1d|«Œ| s| j3|
«| s| j)| «| |j|<| S) zÚLemmatize using a rule-based approach.
token (Token): The token to lemmatize.
RETURNS (list): The available lemmas for the string.
DOCS: https://spacy.io/api/lemmatizer#rule_lemmatize
)ÚÚeolÚspacerzrBrAr@ÚpropnNr)ÚorthÚposÚmorphÚkeyrRrtÚpos_ÚlowerÚwarningsÚwarnrÚW108Ú is_base_formrHriÚanyrsÚendswithÚlenÚisalphaÚappendÚlistÚdictÚfromkeysÚinsertÚextend)rSr_Ú cache_keyÚstringÚuniv_posÚ index_tableÚ exc_tableÚ rules_tableÚindexÚ
exceptionsÚrulesÚorigÚformsÚ oov_formsÚoldÚnewÚforms r.rLzLemmatizer.rule_lemmatizeÄs\ð—Z‘Z §¡¨E¯K©K¯O©OÐ<ˆ Ø ˜Ÿ
Ñ —:‘:˜iÑ ˆØ—:‘:ר Ð ˜2Š~Ü
œhŸm™mÔ—L‘L“NÐ × Ñ ˜UÔ —L‘L“NÐ —l‘l×,¨]¸BÓ Ø—L‘L×*¨;¸Ó;ˆ Ø—l‘l×,¨]¸BÓ Üà Ó
˜hÓ Ó
ô
ð˜xàŸ Ð ¨"Ó-ˆØ—]] 8¨RÓ
Ø ¨"ÓØˆØˆØˆØˆ Û‰HˆCؘsÕРF£ ¬c°#«hÑ 6Ð7¸#Ñ=ÙØØ˜U‘]¨$¯,©,¬.Ø—LL Õ×$ ô”T—]‘] 5Óð
—N‘N 6¨2Ö.ˆDؘ5Ò Ø ˜Q ÕØ L‰L˜Ô Ø L‰L˜Ô Ø %ˆ
؈ r0cóy)aCheck whether the token is a base form that does not need further
analysis for lemmatization.
token (Token): The token.
RETURNS (bool): Whether the token is a base form.
DOCS: https://spacy.io/api/lemmatizer#is_base_form
Fr:)rSr_s r.r‡zLemmatizer.is_base_formsðr0©ÚexcludeÚpathr£có\i}ˆˆfd|d<ˆfd|d<tj||«y)zÞSerialize the pipe to disk.
path (str / Path): Path to a directory.
exclude (Iterable[str]): String names of serialization fields to exclude.
DOCS: https://spacy.io/api/lemmatizer#to_disk
có>jj|¬«S©Nr¢)r-Úto_disk©Úpr£rSs €€r.ú<lambda>z$Lemmatizer.to_disk.<locals>.<lambda>sø€ t§z¡z×'9Ñ'9¸!ÀWÐ'9Ô'Mr0r-có:jj|«Sr9)rH©rSs €r.z$Lemmatizer.to_disk.<locals>.<lambda>sø€¨¯©×)=Ñ)=¸aÔ)@r0rHN)r)rSÚ serializes` ` r.zLemmatizer.to_disks.ù€ðˆ ÜMˆ Û@ˆ Ü T˜9 .r0có~i}ˆˆfd|d<ˆfd|d<tj||«j«S)aHLoad the pipe from disk. Modifies the object in place and returns it.
path (str / Path): Path to a directory.
exclude (Iterable[str]): String names of serialization fields to exclude.
RETURNS (Lemmatizer): The modified Lemmatizer object.
DOCS: https://spacy.io/api/lemmatizer#from_disk
có>jj|¬«S)r-Ú from_diskr©s €€r.z&Lemmatizer.from_disk.<locals>.<lambda>-sø€¨¯©×)=Ñ)=¸Ð)=Ô)Qr0r-có:jj|«Sr9)rHr­s €r.z&Lemmatizer.from_disk.<locals>.<lambda>.sø€¨4¯<©<×+AÑ+AÀ!Ô+Dr0rH)rrY)rSÚ deserializes` ` r.zLemmatizer.from_disk!s?ù€ð8:ˆ Ü Û!Dˆ Ü t˜[¨'Ô ×ÑÔØˆ r0cózi}ˆˆfd|d<jj|d<tj|«S)zçSerialize the pipe to a bytestring.
exclude (Iterable[str]): String names of serialization fields to exclude.
RETURNS (bytes): The serialized object.
DOCS: https://spacy.io/api/lemmatizer#to_bytes
có<jj¬«S)r-Úto_bytes)rSs€€r.z%Lemmatizer.to_bytes.<locals>.<lambda><sø€ T§Z¡Z×%8Ñ%8ÀÐ%8Ô%Ir0r-rH)rHr)rSs`` r.zLemmatizer.to_bytes3s9ù€ðˆ ÜIˆ Ø#Ÿ|™|×4ˆ Ü}‰}˜Ó0r0Ú