INTUIA/Programa final/spacy/pipeline/__pycache__/entityruler.cpython-312.pyc
193 lines
26 KiB

2026-03-15 13:27:50 +00:00
Ë
>û glTãóìddlZddlmZddlmZddlmZmZmZm Z m
Z
m Z m Z m
Z
mZddlZddlmZmZddlmZddlmZmZdd lmZdd
lmZdd lmZmZdd lm Z dd
l!m"Z"m#Z#m$Z$m%Z%m&Z&ddl'm(Z(dZ)ee*ee*e
ee*efffZ+ejXdgd¢dddidde)ddidœdddddœ¬«dede*de ee-e*fd ed!e.d"e.d#e*d$e efd%„«Z/d&„Z0e%jbd«d'„«Z2Gd(„d)e(«Z3y)*éN)Ú defaultdict)ÚPath) ÚAnyÚCallableÚDictÚIterableÚListÚOptionalÚSequenceÚTupleÚUnioné)ÚErrorsÚWarnings)ÚLanguage)ÚMatcherÚ
PhraseMatcher)Úlevenshtein_compare©Ú get_ner_prf)ÚDocÚSpan)ÚExample)ÚSimpleFrozenListÚ ensure_pathÚ from_diskÚregistryÚto_diské)ÚPipez||Ú entity_ruler)zdoc.entsztoken.ent_typez
token.ent_iobz@misczspacy.levenshtein_compare.v1Fz@scorerszspacy.entity_ruler_scorer.v1©Úphrase_matcher_attrÚmatcher_fuzzy_compareÚvalidateÚoverwrite_entsÚ
ent_id_sepÚscorergð?g)Úents_fÚents_pÚents_rÚ
ents_per_type)ÚassignsÚdefault_configÚdefault_score_weightsÚnlpÚnamer#r$r%r&r'r(c
ó(t||||||||¬«S)Nr")Ú EntityRuler)r0r1r#r$r%r&r'r(s ú[C:\Users\garci\AppData\Roaming\Python\Python312\site-packages\spacy/pipeline/entityruler.pyÚmake_entity_rulerr5s)ô8 Ø Ø ØØØô ð óc ót|«S©Nr)ÚexamplesÚkwargss r4Úentity_ruler_scorer;>s
Ü  Ð r6cótSr8)r;©r6r4Úmake_entity_ruler_scorerr>Bsä Ðr6cóxeZdZdZ d-deddededœdedede e
e efde d e
d
e
d ed e eed
e e ddfdZde fdZdede
fdZdedefdZdefdZdZedeedffd«Zdddœde geefde ed e eefdZedee edffd«Zedeefd«Zd eeddfdZ d.dZ!d eddfd!„Z"d.d"„Z#dedeee effd#„Z$de%d e%defd$„Z&e'«d%œd&e(d'eeddfd(„Z)e'«d%œd'eede(fd)„Z*e'«d%œd*e
ee+fd'eeddfd+„Z,e'«d%œd*e
ee+fd'eeddfd,„Z-y)/r3The EntityRuler lets you add spans to the `Doc.ents` using token-based
rules or exact phrase matches. It can be combined with the statistical
`EntityRecognizer` to boost accuracy, or used on its own to implement a
purely rule-based entity recognition system. After initialization, the
component is typically added to the pipeline using `nlp.add_pipe`.
DOCS: https://spacy.io/api/entityruler
USAGE: https://spacy.io/usage/rule-based-matching#entityruler
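The idea of a purely rule-based recognizer described above can be illustrated without spaCy. The following is a minimal sketch in plain Python: `find_phrase_matches` is a hypothetical helper (not part of spaCy) that scans a whitespace-tokenized sentence for exact phrase matches and reports them as `(label, start, end)` triples, which is roughly the information an entity ruler needs before it can write spans to `Doc.ents`.

```python
from typing import Dict, List, Tuple


def find_phrase_matches(
    tokens: List[str], phrases: Dict[str, str]
) -> List[Tuple[str, int, int]]:
    """Return (label, start, end) for every exact occurrence of a phrase.

    Hypothetical helper for illustration only; `end` is exclusive.
    """
    matches = []
    for label, phrase in phrases.items():
        words = phrase.split()
        n = len(words)
        # Slide a window of the phrase length over the token list
        for start in range(len(tokens) - n + 1):
            if tokens[start:start + n] == words:
                matches.append((label, start, start + n))
    return matches


tokens = "Apple opened an office in San Francisco".split()
matches = find_phrase_matches(tokens, {"ORG": "Apple", "GPE": "San Francisco"})
print(matches)
```

The real component delegates this work to spaCy's `Matcher` and `PhraseMatcher`, which operate on `Doc` objects rather than plain string lists.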
Initialize the entity ruler. If patterns are supplied here, they
need to be a list of dictionaries with a `"label"` and `"pattern"`
key. A pattern can either be a token pattern (list) or a phrase pattern
(string). For example: `{'label': 'ORG', 'pattern': 'Apple'}`.
nlp (Language): The shared nlp object to pass the vocab to the matchers
and process phrase patterns.
name (str): Instance name of the current pipeline component. Typically
passed in automatically from the factory when the component is
added. Used to disable the current entity ruler while creating
phrase patterns with the nlp object.
phrase_matcher_attr (int / str): Token attribute to match on, passed
to the internal PhraseMatcher as `attr`.
matcher_fuzzy_compare (Callable): The fuzzy comparison method for the
internal Matcher. Defaults to
spacy.matcher.levenshtein.levenshtein_compare.
validate (bool): Whether patterns should be validated, passed to
Matcher and PhraseMatcher as `validate`.
patterns (iterable): Optional patterns to load in.
overwrite_ents (bool): If existing entities are present, e.g. entities
added by the model, overwrite them by matches if necessary.
ent_id_sep (str): Separator used internally for entity IDs.
scorer (Optional[Callable]): The scoring method. Defaults to
spacy.scorer.get_ner_prf.
DOCS: https://spacy.io/api/entityruler#init
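The settings documented above map onto a small configuration dictionary. The sketch below is plain Python and does not import spaCy. The default values mirror what appears to be registered with the factory in this dump (`validate` and `overwrite_ents` off, `"||"` as the ID separator, registry references for the fuzzy comparison and the scorer); treat the exact values as an assumption. `build_config` is a hypothetical helper, not a spaCy function.

```python
from typing import Any, Dict

# Assumed defaults, read off the factory registration visible in the dump
DEFAULT_CONFIG: Dict[str, Any] = {
    "phrase_matcher_attr": None,
    "matcher_fuzzy_compare": {"@misc": "spacy.levenshtein_compare.v1"},
    "validate": False,
    "overwrite_ents": False,
    "ent_id_sep": "||",
    "scorer": {"@scorers": "spacy.entity_ruler_scorer.v1"},
}


def build_config(**overrides: Any) -> Dict[str, Any]:
    """Merge user overrides into the defaults, rejecting unknown keys."""
    unknown = set(overrides) - set(DEFAULT_CONFIG)
    if unknown:
        raise KeyError(f"Unknown entity ruler settings: {sorted(unknown)}")
    # Later keys win, so overrides replace the defaults
    return {**DEFAULT_CONFIG, **overrides}


config = build_config(overwrite_ents=True, phrase_matcher_attr="LOWER")
print(config["overwrite_ents"], config["ent_id_sep"])
```

In spaCy itself, a dictionary of this shape would typically be passed as the `config` argument of `nlp.add_pipe("entity_ruler", config=...)`.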
©r%Ú
fuzzy_compare©Úattrr%N)r0r1Ú overwriterÚlistÚtoken_patternsÚphrase_patternsÚ _validater$rÚvocabÚmatcherr#rÚphrase_matcherr'ÚtupleÚ_ent_idsÚ add_patternsr()
Úselfr0r1r#r$r%r&r'r@r(s
r4Ú__init__zEntityRuler.__init__Rs¸ðPˆŒØˆŒ ØŒÜ)¬$ÓÔÜ*¬4ÓÔØŒØ%:ˆÔØ I‰I ¸×8RÑ8Rô
ˆŒ ð$7ˆÔ Ü I‰I˜D×4¸xô
ˆÔðŒÜ#¤EÓŒ
Ø Ð Ø × Ñ ˜hÔ ˆ r6có´td|jj«D««}td|jj«D««}||zS)z5The number of all patterns added to the entity ruler.c3ó2K|]}t|«Œy­wr8©Úlen©Ú.0Úps r4ú <genexpr>z&EntityRuler.__len__.<locals>.<genexpr>sèø€ÐLÑ/K¨!œs 1ŸvÑ/Kùóc3ó2K|]}t|«Œy­wr8rVrXs r4r[z&EntityRuler.__len__.<locals>.<genexpr>sèø€ÐNÑ0M¨1¤ Ñ0Mùr\)ÚsumrIÚvaluesrJ)rRÚn_token_patternsÚn_phrase_patternss r4Ú__len__zEntityRuler.__len__ŽsNäÑL¨t×/BÑ/B×/IÑ/IÔ/KÓÜÑ×0DÑ0D×0KÑ0KÔ0MÓØÐ"3Ñ3r6Úlabelcó>||jvxs||jvS)z+Whether a label is present in the patterns.)rIrJ)rRrcs r4Ú __contains__zEntityRuler.__contains__”s#à˜×L¨u¸×8LÑ8LÐ/LÐLr6ÚdoccóÆ|j«} |j|«}|j||«|S#t$r }||j||g|«cYd}~Sd}~wwxYw)zæFind matches in document and add them as entities.
doc (Doc): The Doc object in the pipeline.
RETURNS (Doc): The Doc with added entities, if available.
DOCS: https://spacy.io/api/entityruler#call
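Calling the component finds matches and then decides which of them become entities. The sketch below models that second step in plain Python, on `(label, start, end)` triples instead of spaCy `Span` objects. It assumes that empty and duplicate matches are dropped, that longer matches win over shorter ones with earlier starts breaking ties, and that existing entities are only replaced when `overwrite` is set; these rules are inferred from names visible in the dump (`get_sort_key`, `seen_tokens`, `overwrite`) and should be read as an approximation, not the library's exact code.

```python
from typing import List, Set, Tuple

Match = Tuple[str, int, int]  # (label, start, end), end exclusive


def resolve_matches(
    matches: List[Match], existing: List[Match], overwrite: bool = False
) -> List[Match]:
    """Pick non-overlapping matches and merge them with existing entities."""
    # Drop empty and duplicate matches
    unique = {m for m in matches if m[1] != m[2]}
    # Longest span first; among equal lengths, the earliest start first
    ordered = sorted(unique, key=lambda m: (m[2] - m[1], -m[1]), reverse=True)

    entities = list(existing)
    new_entities: List[Match] = []
    seen: Set[int] = set()
    for label, start, end in ordered:
        overlaps_existing = any(s < end and e > start for _, s, e in entities)
        if overlaps_existing and not overwrite:
            continue  # keep the entity that was already there
        if start in seen or end - 1 in seen:
            continue  # tokens already claimed by a longer match
        new_entities.append((label, start, end))
        # Remove any existing entity that the new span overlaps
        entities = [
            (lab, s, e) for lab, s, e in entities if not (s < end and e > start)
        ]
        seen.update(range(start, end))
    return sorted(entities + new_entities, key=lambda m: m[1])


found = [("GPE", 5, 7), ("GPE", 5, 6), ("ORG", 0, 1), ("ORG", 0, 1)]
print(resolve_matches(found, existing=[]))
```

With `existing=[("PERSON", 0, 1)]`, the `ORG` match on the same token is skipped unless `overwrite=True` is passed.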
N)Úget_error_handlerÚmatchÚset_annotationsÚ Exceptionr1)rRrfÚ
error_handlerÚmatchesÚes r4Ú__call__zEntityRuler.__call__˜s`ð×
ð—j‘j “oˆGØ × Ñ   gÔ ˆJøÜò  §¡¨D°3°%¸Ó ;ûð <ús$7· A ÁAÁA ÁA c
ó”|j«tj«5tjdd¬«t |j |««t |j
|««z}ddd«tDcgc]\}}}||k7sŒ
|||fŒc}}}«}d}t||d¬«}|S#1swYŒFxYwcc}}}w)ignorez\[W036)Úmessagecó$|d|dz
|d fS)Nrrr=)Úms r4ú<lambda>z#EntityRuler.match.<locals>.<lambda>±s ! A¡$¨¨1©¡+°°!±¨uÑ!5r6T)ÚkeyÚreverse) Ú_require_patternsÚwarningsÚcatch_warningsÚfilterwarningsrHrMrNÚsetÚsorted)rRrfrmÚm_idÚstartÚendÚ
final_matchesÚ get_sort_keys r4rizEntityRuler.match¨Ø ×ÑÔ Ü
×
× # H°iÕ ˜4Ÿ<™<¨Ó-´°T×5HÑ5HÈÓ5MÓ0NÑNˆÙ8?Õ Ñ$4 D¨%°À5ÈCÃ<ˆdE˜
¸Ó 
ˆ
ñ ܘ}°,ÈÔ
ØÐ÷
&üô
Qs¥A
B7ÂC ÂC Â7Ccó t|j«}g}t«}|D\}}}td|||D««r
|jsŒ)||vsŒ.|dz
|vsŒ6||j
vr#|j
|\} }
t
|||| |
¬«} nt
||||¬«} |j| «|D cgc]#} | j|kr| j|kDrŒ"| Œ%}} |jt||««ŒÑ||z|_ycc} w)zModify the document in placec3ó4K|]}|jŒy­wr8)Úent_type)rYÚts r4r[z.EntityRuler.set_annotations.<locals>.<genexpr>»sèø€Ð6¡~ !1—:•:¡~ùsr)rcÚspan_id)rcN) rHÚentsr|ÚanyrGrPrÚappendrr€ÚupdateÚrange)
rRrfrmÚentitiesÚ new_entitiesÚ seen_tokensÚmatch_idrr€rcÚent_idÚspanrns
r4rjzEntityRuler.set_annotationsµs䘟“>ˆØˆ Ü“eˆ Û$+Ñ ˆHe˜SÜÑ6 s¨5°¡~Ó6¸t¿~º~Øà˜'¨C°!©G¸;Ò,FؘtŸ}™}Ñ,Ø$(§M¡M°(Ñ$;ME˜  U¨C°uÀfÔM  U¨C°xÔ@×# Ù'˜·±¸#²
À!Ç%Á%È%Ã-’A˜xððð×"¤5¨°Ó#4Õ%,ð˜lÑ*ˆùò s Â/#DÃD.cónt|jj««}|j|jj««t«}|D]G}|j
|vr&|j
|«\}}|j|«Œ7|j|«ŒItt|««S)z”All labels present in the match patterns.
RETURNS (set): The string labels.
DOCS: https://spacy.io/api/entityruler#labels
)
r|rIÚkeysrrJr'Ú _split_labelÚaddrOr})rRr”Ú
all_labelsÚlrcÚ_s r4ÚlabelszEntityRuler.labelsËs“ô4×&×.ˆØ “Uˆ
ãˆAØ !Ñ×,¨QÓ/‘˜uÕ˜qÕ ô ”V˜(r6)r0r@Ú get_examplescóL|j«|r|j|«yy)aInitialize the pipe for training.
get_examples (Callable[[], Iterable[Example]]): Function that
returns a representative sample of gold-standard Example objects.
nlp (Language): The current nlp object the component is part of.
patterns (Optional[Iterable[PatternType]]): The list of patterns.
DOCS: https://spacy.io/api/entityruler#initialize
N)ÚclearrQ)rRrr0r@s r4Ú
initializezEntityRuler.initializeßs#ð
Œ Ù Ø × Ñ ˜  r6có:t|jj««}|j|jj««t«}|D]6}|j
|vsŒ|j
|«\}}|j|«Œ8t|«S)z¬All entity ids present in the match patterns `id` properties
RETURNS (set): The string entity ids.
DOCS: https://spacy.io/api/entityruler#ent_ids
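Labels and entity IDs appear to share one internal key, joined by the separator documented as `ent_id_sep` (the string `"||"` is visible in the dump). The sketch below shows, in plain Python, how such keys could be built and taken apart again. `create_label`, `split_label`, and `labels_and_ids` are hypothetical names, and the results are sorted here only to make the output deterministic.

```python
from typing import Iterable, Optional, Tuple

ENT_ID_SEP = "||"  # assumed default separator


def create_label(label: str, ent_id: Optional[str]) -> str:
    """Join label and ID into one key; labels without an ID stay unchanged."""
    return f"{label}{ENT_ID_SEP}{ent_id}" if isinstance(ent_id, str) else label


def split_label(key: str) -> Tuple[str, Optional[str]]:
    """Inverse of create_label: return (label, ent_id or None)."""
    label, sep, ent_id = key.rpartition(ENT_ID_SEP)
    return (label, ent_id) if sep else (key, None)


def labels_and_ids(keys: Iterable[str]) -> Tuple[Tuple[str, ...], Tuple[str, ...]]:
    """Collect the distinct labels and entity IDs behind a set of keys."""
    labels, ent_ids = set(), set()
    for key in keys:
        label, ent_id = split_label(key)
        labels.add(label)
        if ent_id is not None:
            ent_ids.add(ent_id)
    return tuple(sorted(labels)), tuple(sorted(ent_ids))


keys = [
    create_label("ORG", "apple"),
    create_label("GPE", None),
    create_label("ORG", "msft"),
]
labels, ent_ids = labels_and_ids(keys)
print(labels, ent_ids)
```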
) r|rIr”rrJr'r•rrO)rRr”Ú all_ent_idsr˜r™rs r4Úent_idszEntityRuler.ent_idsósô4×&×.ˆØ “eˆ ãˆAØ !Ò ×-¨aÓ0‘  Õô!r6có†g}|jj«D]=\}}|D]3}|j|«\}}||dœ}|r||d<|j|«Œ5Œ?|jj«D]G\}}|D]=}|j|«\}}||j
dœ}|r||d<|j|«Œ?ŒI|S)zÃGet all patterns that were added to the entity ruler.
RETURNS (list): The original patterns, one dictionary per pattern.
DOCS: https://spacy.io/api/entityruler#patterns
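Returning "one dictionary per pattern" means turning the internal, label-keyed storage back into the `{"label", "pattern", "id"}` form that users supplied. A minimal sketch of that export, assuming keys of the form `LABEL||id` and plain strings for phrase patterns (spaCy stores phrase patterns as `Doc` objects and exports their text); `export_patterns` is a hypothetical function name.

```python
from typing import Any, Dict, List

SEP = "||"  # assumed ID separator


def export_patterns(
    token_patterns: Dict[str, List[list]],
    phrase_patterns: Dict[str, List[str]],
) -> List[Dict[str, Any]]:
    """Flatten label-keyed storage into a list of pattern dictionaries."""
    exported: List[Dict[str, Any]] = []
    # Token patterns are listed first, then phrase patterns
    for store in (token_patterns, phrase_patterns):
        for key, patterns in store.items():
            label, _, ent_id = key.partition(SEP)
            for pattern in patterns:
                entry: Dict[str, Any] = {"label": label, "pattern": pattern}
                if ent_id:
                    entry["id"] = ent_id  # only present when an ID was given
                exported.append(entry)
    return exported


token_store = {"GPE||sf": [[{"lower": "san"}, {"lower": "francisco"}]]}
phrase_store = {"ORG": ["Apple"]}
exported = export_patterns(token_store, phrase_store)
print(exported)
```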
©rcÚpatternÚid)rIÚitemsr•rJÚtext)rRÚ all_patternsrcr@Ú ent_labelrrZs r4r@zEntityRuler.patternss×ðˆ Ø:‰OˆE#Ø$(×$5Ñ$5°eÓ$<Ñ! ˜'°GÑ<ÙØ$Ad×#   $×3×;‰OˆE#Ø$(×$5Ñ$5°eÓ$<Ñ! ˜'°G·L±LÑAÙØ$Ad×#  Ðr6cóH d}t|jj«D]\}\}}||k(sŒ|}n|jj|dDcgc]}|Œ}}|jj |¬«5g}g}g} g}
|D} t
| dt«rI|j| d«| j| d«|
j| jd««Œ_t
| dt«sŒs|j| «Œ…g} t||jj| «|
«D]#\}
}}|
|dœ}|r||d<| j|«Œ%|| zD]} | d}
d| vrF|
}|j|
| d«}
|jj|
«}|| df|j |<| d}t
|t"«r<|j$|
j|«|j&j)|
|g«Œ¤t
|t«r<|j*|
j|«|jj)|
|g«Œðt t,j.j1|¬«« ddd«ycc}w#t$rg}YŒ7wxYw#1swYyxYw) a~Add patterns to the entity ruler. A pattern can either be a token
pattern (list of dicts) or a phrase pattern (string). For example:
{'label': 'ORG', 'pattern': 'Apple'}
{'label': 'GPE', 'pattern': [{'lower': 'san'}, {'lower': 'francisco'}]}
patterns (list): The patterns to add.
DOCS: https://spacy.io/api/entityruler#add_patterns
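The two pattern kinds described above (a list of token dictionaries versus a plain string) can be told apart with a type check and stored per label. The sketch below is a simplified stand-in in plain Python: `PatternStore` is a hypothetical class that imitates the bookkeeping only (grouping, counting, label lookup) and does not build any matcher.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List


class PatternStore:
    """Hypothetical container grouping patterns by label, for illustration."""

    def __init__(self) -> None:
        self.token_patterns: Dict[str, List[list]] = defaultdict(list)
        self.phrase_patterns: Dict[str, List[str]] = defaultdict(list)

    def add_patterns(self, patterns: Iterable[Dict[str, Any]]) -> None:
        for entry in patterns:
            label, pattern = entry["label"], entry["pattern"]
            if isinstance(pattern, str):
                self.phrase_patterns[label].append(pattern)  # phrase pattern
            elif isinstance(pattern, list):
                self.token_patterns[label].append(pattern)  # token pattern
            else:
                raise ValueError(f"Unsupported pattern: {pattern!r}")

    def __len__(self) -> int:
        # Total number of patterns across both kinds
        n_token = sum(len(p) for p in self.token_patterns.values())
        n_phrase = sum(len(p) for p in self.phrase_patterns.values())
        return n_token + n_phrase

    def __contains__(self, label: str) -> bool:
        return label in self.token_patterns or label in self.phrase_patterns


store = PatternStore()
store.add_patterns([
    {"label": "ORG", "pattern": "Apple"},
    {"label": "GPE", "pattern": [{"lower": "san"}, {"lower": "francisco"}]},
])
print(len(store), "ORG" in store, "PERSON" in store)
```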
éÿÿÿÿN)Údisabler¤rc))Ú enumerater0ÚpipelineÚ
pipe_namesÚ
ValueErrorÚ select_pipesÚ
isinstanceÚstrrŠÚgetrHÚzipÚpipeÚ
_create_labelrMÚ_normalize_keyrPrrJrNrrIrÚE097Úformat)rRr@Ú
current_indexÚir1Úsubsequent_pipesrIÚphrase_pattern_labelsÚphrase_pattern_textsÚphrase_pattern_idsÚentryrJrcrÚphrase_patternr©rvs r4rQzEntityRuler.add_patternsðˆMÜ#,¨T¯X©X×->Ñ->Ö#?<D˜˜4“<Ø$%ð$@ð26·±×1DÑ1DÀ]À^Ñ1TÓUÑ1T¨¢Ð1TÐ ÐX‰X×
"Ð+;Ð
ˆNØ$&Ð !Ø#%Ð Ø!#Ð Û!ܘe IÑÔ°w±Ô°iÑ0@Ô&×-¨e¯i©i¸«oÕ  iÑ 0´"×)¨%Õ
!ˆOÜ*-Ø
Ð+Ñ&w ð
,1¸WÑ!EÙØ+1N ×& +ð(¨/Õ9ؘg™Ø˜5‘=Ø % ×.¨u°e¸D±kÓBŸ,™,×5°eÓ<CØ*3°U¸4±[Ð)AD—MM   Ñ*ܘg¤sÔ×Ñ/×6°wÔ×'×+¨E°G°9Õ ¬Ô×Ñ.×5°gÔ—LL×$ U¨W¨IÕ$¤V§[¡[×%7Ñ%7ÀÐ%7Ó%HÓ:÷-
<ùò VøÜò ð "úç
<ús</J² JÁ JÁJÁ9A=JÃ7FJÊJÊ JÊJÊJ!cóhtt«|_tt«|_tt«|_t
|jj|j|j¬«|_ t|jj|j|j¬«|_y)zReset all patterns.rCrEN)rrHrIrJrOrPrr0rLrKr$rMrr#rN©rRs r4rzEntityRuler.clearZswä)¬$ÓÔÜ*¬4ÓÔÜ#¤EÓŒ
ÜØ H‰HN‰NØ—^‘^Ø×
ˆŒ ô
 H‰HN‰N ×!9Ñ!9ÀDÇNÁNô
ˆÕr6rc óþ|jj«Dcgc]\}}||k(sŒ ||fŒ}}}|s5ttjj d||j ¬««|Dcgc]\}}|j||«Œ}}}tt|jj«Dcic] \}}||vr||Œc}}«|_
tt|jj«Dcic] \}}||vr||Œc}}«|_ |D]G}||jvr|jj|«Œ-|jj|«ŒIycc}}wcc}}wcc}}wcc}}w)zÙRemove a pattern by its ent_id if a pattern with this ent_id was added before
ent_id (str): ID of the pattern to be removed.
RETURNS: None
DOCS: https://spacy.io/api/entityruler#remove
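Removing by ID amounts to filtering out every stored key that carries the given ID and failing if none exists. A minimal sketch in plain Python, assuming keys of the form `LABEL||id`; `remove_by_id` is a hypothetical function, and the `ValueError` stands in for whatever error the library raises when the ID is unknown.

```python
from typing import Dict, List, Optional, Tuple

SEP = "||"  # assumed ID separator


def split_label(key: str) -> Tuple[str, Optional[str]]:
    """Return (label, ent_id), with ent_id None when the key has no ID."""
    label, sep, ent_id = key.rpartition(SEP)
    return (label, ent_id) if sep else (key, None)


def remove_by_id(store: Dict[str, List], ent_id: str) -> Dict[str, List]:
    """Return a copy of the store without patterns registered under ent_id."""
    doomed = {key for key in store if split_label(key)[1] == ent_id}
    if not doomed:
        raise ValueError(f"No pattern with ID {ent_id!r} was added")
    return {key: pats for key, pats in store.items() if key not in doomed}


phrase_store = {
    "ORG||apple": ["Apple"],
    "ORG||msft": ["Microsoft"],
    "GPE": ["Paris"],
}
remaining = remove_by_id(phrase_store, "apple")
print(remaining)
```

The real method also has to drop the corresponding entries from the underlying matchers, which this sketch leaves out.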
ÚID)Ú attr_typercÚ componentN)rPr_rÚE1024rºr1rrHrJrIrNÚremoverM)rRrrcÚeidÚlabel_id_pairsÚcreated_labelsÚvals r4zEntityRuler.removehð.2¯]©]×-AÑ-AÔ-Cô
Ù-C™\˜e SÀsÈfÃ}ˆUCŠLÐ-Cð ñ
ñÜÜ ×#¨d¸&ÈDÏIÉIÐð
ñ@Nô
Ù?M©|°°sˆD× Ñ ˜u cÕ *¸~ð ñ
ô ð%)×$8Ñ$8×$>Ñ$>Ô$@ô
á$@‘LU˜ Ñs