INTUIA/Programa final/spacy/pipeline/__pycache__/span_ruler.cpython-312.pyc
237 lines
28 KiB
2026-03-15 13:27:50 +00:00
Ë
>û g‡VãóèddlZddlmZddlmZddlmZmZmZm Z m
Z
m Z m Z m
Z
mZmZmZddlZddlmZddlmZmZddlmZdd lmZmZdd
lmZdd lmZdd l m!Z!m"Z"dd
l#m$Z$ddlm%Z%m&Z&m'Z'ddl(m)Z)ee*ee*e
ee*efffZ+dZ,ejZddgdddddidddidœdddddœ¬«ded e*d!e ee.e*fd"ed#e/d$e/d%e ed&e*fd'„«Z0ejZd(d)ge,dddd*idddidd+d,e,d-œd.œ d/e,d0dd/e,d1dd/e,d2dd/e,d3di¬«ded e*d4e e*d5e ee e"e e"ge e"fd6e/d7ee e"e e"ge e"fd!e ee.e*fd"ed#e/d8e/d%e efd9„«Z1d:e e"d;e e"d<e
e"fd=„Z2e'jfd>«d?„«Z4d:e e"d;e e"d<e
e"fd@„Z5e'jfdA«dB„«Z6e,dCœdDe e$d<ee*effdE„Z7e'jpd,«e,fd4e*fdF„«Z9GdG„dHe)«Z:y)IéN)Úpartial)ÚPath) ÚAnyÚCallableÚDictÚIterableÚListÚOptionalÚSequenceÚSetÚTupleÚUnionÚcasté)Úutil)ÚErrorsÚWarnings)ÚLanguage)ÚMatcherÚ
PhraseMatcher)Úlevenshtein_compare)ÚScorer)ÚDocÚSpan)ÚExample)ÚSimpleFrozenListÚ ensure_pathÚregistryé)ÚPipeÚrulerÚfuture_entity_rulerzdoc.entsFú@scorerszspacy.entity_ruler_scorer.v1Ú
__unused__z@misczspacy.levenshtein_compare.v1)Úphrase_matcher_attrÚvalidateÚoverwrite_entsÚscorerÚ
ent_id_sepÚmatcher_fuzzy_comparegð?g)Úents_fÚents_pÚents_rÚ
ents_per_type)ÚassignsÚdefault_configÚdefault_score_weightsÚnlpÚnamer%r*r&r'r(r)c
óL|rt}nt}t||ddd||||d|¬« S)NTF© Ú spans_keyÚ spans_filterÚ
annotate_entsÚ ents_filterr%r*r&Ú overwriter()Úprioritize_new_ents_filterÚprioritize_existing_ents_filterÚ SpanRuler) r2r3r%r*r&r'r(r)r9s úZC:\Users\garci\AppData\Roaming\Python\Python312\site-packages\spacy/pipeline/span_ruler.pyÚmake_entity_rulerr?#s?ñ8Ü0‰ ä Ü Ø Ø ØØØØØØØô ð óÚ
span_rulerz doc.spansz#spacy.first_longest_spans_filter.v1Tz)spacy.overlapping_labeled_spans_scorer.v1)r#r6r5Úspans_Ú_fÚ_pÚ_rÚ _per_typer6r7r8r9r:c
ó.t|||||||||| |
¬« S)Nr5)r=) r2r3r6r7r8r9r%r*r&r:r(s r>Úmake_span_rulerrHRs3ôJ Ø Ø ØØØØØô ð r@ÚentitiesÚspansÚreturncó„d}t||d¬«}t|«}g}t«Š|D]‰}|j}|j}t ˆfd|D««sŒ0|j
|«|Dcgc]#}|j|kr|j|kDrŒ"|Œ%}}jt||««Œ‹||zScc}w)Merge entities and spans into one list without overlaps by allowing
spans to overwrite any entities that they overlap with. Intended to
replicate the overwrite_ents=True behavior from the EntityRuler.
entities (Iterable[Span]): The entities, already filtered for overlaps.
spans (Iterable[Span]): The spans to merge, may contain overlaps.
RETURNS (List[Span]): Filtered list of non-overlapping spans.
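The merge rule described in this docstring can be sketched in plain Python. This is only an illustration under stated assumptions, not the compiled implementation: `Span` here is a hypothetical stand-in for spaCy's `Span` that carries half-open token offsets, and `prioritize_new` is an illustrative name.

```python
from typing import Iterable, List, NamedTuple


class Span(NamedTuple):
    """Hypothetical stand-in for a spaCy Span: half-open token offsets."""
    start: int
    end: int
    label: str


def prioritize_new(entities: Iterable[Span], spans: Iterable[Span]) -> List[Span]:
    # Consider the longest spans first; ties go to the earliest start.
    spans = sorted(spans, key=lambda s: (s.end - s.start, -s.start), reverse=True)
    entities = list(entities)
    new_entities: List[Span] = []
    seen_tokens = set()
    for span in spans:
        # Accept a span only if none of its tokens is already claimed.
        if all(i not in seen_tokens for i in range(span.start, span.end)):
            new_entities.append(span)
            # New spans win: drop every existing entity that overlaps.
            entities = [
                e for e in entities
                if not (e.start < span.end and e.end > span.start)
            ]
            seen_tokens.update(range(span.start, span.end))
    return entities + new_entities


existing = [Span(0, 2, "PERSON"), Span(5, 6, "GPE")]
candidates = [Span(1, 4, "ORG"), Span(2, 3, "PRODUCT")]
merged = prioritize_new(existing, candidates)
```

In this run the longer `ORG` span is accepted first, which removes the overlapping `PERSON` entity and blocks the nested `PRODUCT` span. The untouched `GPE` entity survives.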
cóN|j|jz
|j fS©ÚendÚstart©Úspans r>ú<lambda>z,prioritize_new_ents_filter.<locals>.<lambda>ó §¡¨D¯J©JÑ!6¸¿¹¸ Ñ Dr@ÚkeyÚreversec3ó:K|]}|jvŒy­wrN©ÚÚ.0ÚtokenÚ seen_tokenss €r>ú <genexpr>z-prioritize_new_ents_filter.<locals>.<genexpr>™óøèø€Ð<±t¨eˆuw‰w˜kÔ)±tùóƒ) ÚsortedÚlistÚsetrQrPÚallÚappendÚupdateÚrange) rIrJÚ get_sort_keyÚ new_entitiesrSrQrPÚer_s @r>r;r;ø€ñE€LÜ 5˜l°DÔ 9€EÜH‹~€HØ€LÜ›E€KÛˆØ
ˆØh‰hˆÜ Ó<±tÓ × Ñ  Ô %Ù#+ÓU¡8˜a°A·G±G¸c²MÀaÇeÁeÈeÃmš 8ˆHÐ × Ñ œu U¨CÓ
ð  "ùòVs Á1#B=ÂB=z#spacy.prioritize_new_ents_filter.v1cótSrN)r;©r@r>Úmake_prioritize_new_ents_filterro sä %r@cóJd}t||d¬«}t|«}g}t«Šjd|D«Ž|D][}|j}|j
}t
ˆfd|D««sŒ0|j|«jt||««Œ]||zS)Merge entities and spans into one list without overlaps by prioritizing
existing entities. Intended to replicate the overwrite_ents=False behavior
from the EntityRuler.
entities (Iterable[Span]): The entities, already filtered for overlaps.
spans (Iterable[Span]): The spans to merge, may contain overlaps.
RETURNS (List[Span]): Filtered list of non-overlapping spans.
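For contrast, the "existing entities win" rule described above might look like the following sketch. As before, `Span` is a hypothetical stand-in with half-open token offsets and `prioritize_existing` is an illustrative name. The only intended difference from the overwrite variant is that tokens covered by existing entities are marked as taken before any new span is considered.

```python
from typing import Iterable, List, NamedTuple


class Span(NamedTuple):
    """Hypothetical stand-in for a spaCy Span: half-open token offsets."""
    start: int
    end: int
    label: str


def prioritize_existing(entities: Iterable[Span], spans: Iterable[Span]) -> List[Span]:
    # Consider the longest spans first; ties go to the earliest start.
    spans = sorted(spans, key=lambda s: (s.end - s.start, -s.start), reverse=True)
    entities = list(entities)
    new_entities: List[Span] = []
    # Tokens of existing entities are reserved up front.
    seen_tokens = set()
    for ent in entities:
        seen_tokens.update(range(ent.start, ent.end))
    for span in spans:
        # A new span is kept only if it touches no reserved token.
        if all(i not in seen_tokens for i in range(span.start, span.end)):
            new_entities.append(span)
            seen_tokens.update(range(span.start, span.end))
    return entities + new_entities


existing = [Span(0, 2, "PERSON")]
candidates = [Span(1, 4, "ORG"), Span(4, 6, "GPE")]
merged = prioritize_existing(existing, candidates)
```

Here `ORG` overlaps the existing `PERSON` entity at token 1 and is discarded, while the non-overlapping `GPE` span is added.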
cóN|j|jz
|j fSrNrOrRs r>rTz1prioritize_existing_ents_filter.<locals>.<lambda>°rUr@TrVc3ó\K|]$}t|j|j«Œ&y­wrN)rirQrP)r]Úents r>r`z2prioritize_existing_ents_filter.<locals>.<genexpr>µs èø€ÐG¹h°sœ˜sŸy™y¨#¯'©'×2¹hùs*,c3ó:K|]}|jvŒy­wrNrZr\s €r>r`z2prioritize_existing_ents_filter.<locals>.<genexpr>¹rarb) rcrdrerhrQrPrfrgri)rIrJrjrkrSrQrPr_s @r>r<r<¥ø€ñE€LÜ 5˜l°DÔ 9€EÜH‹~€HØ€LÜ›E€KØ€K×ÑÑG¹hÓˆØ
ˆØh‰hˆÜ Ó<±tÓ × Ñ  Ô × Ñ œu U¨CÓ ð  "r@z(spacy.prioritize_existing_ents_filter.v1cótSrN)r<rnr@r>Ú"make_preserve_existing_ents_filterrv¿sä *r@©r6Úexamplesc ót|«}dŠ|jd«|jdd«|jdd«|jdˆfd«|jdˆfd „«tj|fi|¤ŽS)
NrBÚattrÚ
allow_overlapTÚlabeledÚgettercóT|jj|t«dg«SrN)rJÚgetÚlen)ÚdocrWÚ attr_prefixs €r>rTz1overlapping_labeled_spans_score.<locals>.<lambda>Ís!ø€ 3§9¡9§=¡=°´S¸Ó5EÐ5GÐ1HÈ"Ô#Mr@Úhas_annotationcó |jvSrN)rJ)rr6s €r>rTz1overlapping_labeled_spans_score.<locals>.<lambda>Ïsø€°IÀÇÁÑ4Jr@)ÚdictÚ
setdefaultrÚ score_spans)rxr6Úkwargsrs ` @r>Úoverlapping_labeled_spans_scorer‰Äù€ô&\€FØ€KØ
×Ñf  
¨i¨[Ð
×Ño 
×Ñi Ô
×ÑØÓð ×ÑÐ&Ó(JÔ × Ñ ˜ 1¨&Ñ 1r@có$tt|¬«S)Nrw)rr‰rws r>Ú%make_overlapping_labeled_spans_scorerrÓsä Ô2¸ Hr@cóÚeZdZdZ d1eddej deddee e¬«dœ de
de de e d e e
eeeegeefd
ed e
eeeegeefd e eee fd
e
dedede e
ddfdZdefdZde defdZede e fd«ZdedefdZdefdZdZedee dffd«Zedee dffd«Zdddœde
geefde e
d e e e!fd!„Z"ede#e!fd"„«Z$d e#e!ddfd#„Z%d2d$„Z&de ddfd%„Z'd&e ddfd'„Z(d2d(„Z)e*«d)œd*e+d+ee ddfd,„Z,e*«d)œd+ee de+fd-„Z-e*«d)œd.ee e.fd+ee ddfd/„Z/e*«d)œd.ee e.fd+ee ddfd0„Z0y)3r=z×The SpanRuler lets you add spans to the `Doc.spans` using token-based
rules or exact phrase matches.
DOCS: https://spacy.io/api/spanruler
USAGE: https://spacy.io/usage/rule-based-matching#spanruler
NFrwr5r2r3r6r7r8r9r%r*r&r:r(rKc óÌ||_||_||_||_||_| |_|
|_||_||_| |_ ||_
i|_ |j«y)a¿Initialize the span ruler. If patterns are supplied here, they
need to be a list of dictionaries with a `"label"` and `"pattern"`
key. A pattern can either be a token pattern (list) or a phrase pattern
(string). For example: `{'label': 'ORG', 'pattern': 'Apple'}`.
nlp (Language): The shared nlp object to pass the vocab to the matchers
and process phrase patterns.
name (str): Instance name of the current pipeline component. Typically
passed in automatically from the factory when the component is
added. Used to disable the current span ruler while creating
phrase patterns with the nlp object.
spans_key (Optional[str]): The spans key to save the spans under. If
`None`, no spans are saved. Defaults to "ruler".
spans_filter (Optional[Callable[[Iterable[Span], Iterable[Span]], List[Span]]):
The optional method to filter spans before they are assigned to
doc.spans. Defaults to `None`.
annotate_ents (bool): Whether to save spans to doc.ents. Defaults to
`False`.
ents_filter (Callable[[Iterable[Span], Iterable[Span]], List[Span]]):
The method to filter spans before they are assigned to doc.ents.
Defaults to `util.filter_chain_spans`.
phrase_matcher_attr (Optional[Union[int, str]]): Token attribute to
match on, passed to the internal PhraseMatcher as `attr`. Defaults
to `None`.
matcher_fuzzy_compare (Callable): The fuzzy comparison method for the
internal Matcher. Defaults to
spacy.matcher.levenshtein.levenshtein_compare.
validate (bool): Whether patterns should be validated, passed to
Matcher and PhraseMatcher as `validate`.
overwrite (bool): Whether to remove any existing spans under this spans
key if `spans_key` is set, and/or to remove any ents under `doc.ents` if
`annotate_ents` is set. Defaults to `True`.
scorer (Optional[Callable]): The scoring method. Defaults to
spacy.pipeline.span_ruler.overlapping_labeled_spans_score.
DOCS: https://spacy.io/api/spanruler#init
N)
r2r3r6r8r%r&r:r7r9r(r*Ú_match_label_id_mapÚclear) Úselfr2r3r6r7r8r9r%r*r&r:r(s r>Ú__init__zSpanRuler.__init__àshðtˆŒØˆŒ ØŒØÔØ#6ˆÔ Ø ˆŒ
Ø"ˆŒØ(ˆÔØÔ؈Œ Ø%:ˆÔ"Ø>@ˆÔ Ø
r@có,t|j«S)z1The number of all labels added to the span ruler.)r€Ú _patterns©rs r>Ú__len__zSpanRuler.__len__(sä4—>"r@ÚlabelcóV|jj«D] }|d|k(sŒ yy)z+Whether a label is present in the patterns.rTF)Úvalues)rrÚlabel_ids r>Ú __contains__zSpanRuler.__contains__,s0à×9ˆHØ˜Ñ  EÓðr@có|jS)z2Key of the doc.spans dict to save the spans under.rwr”s r>rWz
SpanRuler.key3sð~‰~Ðr@rcóÆ|j«} |j|«}|j||«|S#t$r }||j||g|«cYd}~Sd}~wwxYw)zäFind matches in document and add them as entities.
doc (Doc): The Doc object in the pipeline.
RETURNS (Doc): The Doc with added entities, if available.
DOCS: https://spacy.io/api/spanruler#call
N)Úget_error_handlerÚmatchÚset_annotationsÚ Exceptionr3)rrÚ
error_handlerÚmatchesrls r>Ú__call__zSpanRuler.__call__8s`ð×
ð—j‘j “oˆGØ × Ñ   gÔ ˆJøÜò  §¡¨D°3°%¸Ó ;ûð <ús$7· A ÁAÁA ÁA c
ó°j«tj«5tjdd¬«t t
t tttftj««tj««z«}ddd«tˆˆfdD««}tt|««S#1swYŒ2xYw)ignorez\[W036)Úmessagec 3óK|]=\}}}||k7r2t||j|dj|d¬«Œ?y­w)rÚid)rÚspan_idN)r)r]Úm_idrQrPrrs €€r>r`z"SpanRuler.match.<locals>.<genexpr>Psdøèø€ð
#
ñ%,Ñ e˜SؘŠ|ô
ØØØØ×.¨tÑ4°WÑ×Ñ6°tÑ 
ð
ñ%,ùsƒAA)
Ú_require_patternsÚwarningsÚcatch_warningsÚfilterwarningsrr r
ÚintrdÚmatcherÚphrase_matcherrerc)rrÚdeduplicated_matchess`` r>zSpanRuler.matchHù€Ø ×ÑÔ Ü
×
× # H°iÕ Ü”Uœ3¤¤S˜=ÑT—\\ '¬$¨t×/BÑ/BÀ3Ó/GÓ*Hш
#
ñ%,ó
#
ó
Ðô”dÐ1÷#
&ús §A4C Ã Ccó|jr‡g}|j|jvr%|js|j|j}|j|jr|j ||«n|«||j|j<|j
rGg}|jst
|j«}|j||«} t|«|_yy#t$rttj«wxYw)zModify the document in placeN)
rWrJr:Úextendr7r8rdÚentsr9rcÚ
ValueErrorrÚE854)rrrJs r>zSpanRuler.set_annotations]ð 8Š8؈EØx‰x˜3Ÿ9™9Ñ$¨T¯^ª^ØŸ ™  $§(¡(Ñ+Ø L‰LØ59×5FÒ5F×! Ô1ÈGô
ð#(ˆCI‰Id—h‘hÑ à × Ò ØˆEØ—>’>ܘSŸX™XØ×$ U¨GÓ4ˆEð
! %=ð
øôò
 ¤§¡Ó
.ús ÃC(Ã(#D .có˜ttt|jDcgc]}t t
|d«Œc}«««Scc}w)zAll labels present in the match patterns.
RETURNS (set): The string labels.
DOCS: https://spacy.io/api/spanruler#labels
r)Útuplercrer“rÚstr©rÚps r>ÚlabelszSpanRuler.labelsss:ô”VœCÀÇÂÓ OÁ¸1¤¤c¨1¨W©:Õ!6ÀÑ OÓRùÒ OsžAcóÊttt|jDcgc]!}t t
|j
d««Œ#c}«tdg«z
««Scc}w)z‰All IDs present in the match patterns.
RETURNS (set): The string IDs.
DOCS: https://spacy.io/api/spanruler#ids
N)rcrer“rrs r>Úidsz
SpanRuler.ids}sOôÜ ”3¸¿ºÓ°1œœS !§%¡%¨£+ÕÑHÌ3ÐPTÈvË;Ñ 
ð
ùÚGsž&A )r2ÚpatternsÚ get_examplesrÀcóL|j«|r|j|«yy)aInitialize the pipe for training.
get_examples (Callable[[], Iterable[Example]]): Function that
returns a representative sample of gold-standard Example objects.
nlp (Language): The current nlp object the component is part of.
patterns (Optional[Iterable[PatternType]]): The list of patterns.
DOCS: https://spacy.io/api/spanruler#initialize
N)rÚ add_patterns)rr2s r>Ú
initializezSpanRuler.initialize‰s#ð
Œ Ù Ø × Ñ ˜  r@có|jS)z¿Get all patterns that were added to the span ruler.
RETURNS (list): The original patterns, one dictionary per pattern.
DOCS: https://spacy.io/api/spanruler#patterns
)r“r”s r>zSpanRuler.patternssð~‰~Ðr@c ób d}t|jj«D]\}\}}||k(sŒ|}n|jj|dDcgc]}|Œ}}|jj |¬«5g}g}|D]6} t
t| d«}
t
t| jdd««} t|
| f«} |
| dœ|j|jjjj| «<t| dt«r&|j| «|j| d«n`t| dt «r!|j"j%| | dg«n,t t&j(j+| d¬ ««|j,j| «Œ9t/||jj1|««D]"\} }
|j2j%| |
g«Œ$ ddd«ycc}w#t$rg}YŒÄwxYw#1swYyxYw)
Add patterns to the span ruler. A pattern can either be a token
pattern (list of dicts) or a phrase pattern (string). For example:
{'label': 'ORG', 'pattern': 'Apple'}
{'label': 'ORG', 'pattern': 'Apple', 'id': 'apple'}
{'label': 'GPE', 'pattern': [{'lower': 'san'}, {'lower': 'francisco'}]}
patterns (list): The patterns to add.
DOCS: https://spacy.io/api/spanruler#add_patterns
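The pattern shapes listed in this docstring can be illustrated without spaCy. The sketch below is an assumption-laden outline, not the library's code: `split_patterns` is a hypothetical helper that separates phrase patterns (strings) from token patterns (lists of dicts) and builds a combined key from the label and the optional ID.

```python
from typing import Any, Dict, List, Tuple

Pattern = Dict[str, Any]


def split_patterns(
    patterns: List[Pattern],
) -> Tuple[List[Tuple[str, str]], List[Tuple[str, list]]]:
    """Hypothetical helper: sort patterns into phrase and token groups."""
    phrase: List[Tuple[str, str]] = []
    token: List[Tuple[str, list]] = []
    for entry in patterns:
        # Assumed key scheme: label plus optional ID ("" when absent).
        key = repr((entry["label"], entry.get("id", "")))
        pattern = entry["pattern"]
        if isinstance(pattern, str):
            phrase.append((key, pattern))   # exact phrase match
        elif isinstance(pattern, list):
            token.append((key, pattern))    # token-based rule
        else:
            raise ValueError(f"Unsupported pattern type: {pattern!r}")
    return phrase, token


patterns = [
    {"label": "ORG", "pattern": "Apple"},
    {"label": "ORG", "pattern": "Apple", "id": "apple"},
    {"label": "GPE", "pattern": [{"lower": "san"}, {"lower": "francisco"}]},
]
phrase_patterns, token_patterns = split_patterns(patterns)
```

With the three example patterns from the docstring, the two string patterns land in the phrase group and the list pattern lands in the token group.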
éÿÿÿÿN)ÚdisablerÚ)rÚpattern))Ú enumerater2ÚpipelineÚ
pipe_namesr¶Ú select_pipesrrÚreprrŽÚvocabÚstringsÚas_intÚ
isinstancergrdÚaddrÚE097Úformatr“ÚzipÚpiper±)rÚ
current_indexr[r3Úsubsequent_pipesÚphrase_pattern_labelsÚphrase_pattern_textsÚentryÚp_labelÚp_idrs r>zSpanRuler.add_patterns§ðˆMÜ#,¨T¯X©X×->Ñ->Ö#?<D˜˜4“<Ø$%ð$@ð26·±×1DÑ1DÀ]À^Ñ1TÓUÑ1T¨¢Ð1TÐ ÐX‰X×
"Ð+;Ð
<Ø$&Ð !Ø#%Ð Ü!Üœs E¨'¡NÓ3ÜœC §¡¨4°Ó!4Ó5ܘg t˜_Ó-àñRׯ©¯©×)?Ñ)?×)FÑ)FÀuÓ)Mјe ÔÔ°iÑ0@Õ  iÑ 0´$Ô—LL×$ U¨U°9Ñ-=Ð,>Õ$¤V§[¡[×%7Ñ%7ÀÀiÑ@PÐ%7Ó%QÓ×% #&Ø
Ð#×#×°¨yÕ#÷'
<ùò VøÜò ð "úç
<ús5/H² HÁ HÁHÁ9F H%ÈHÈ H"È!H"È%H.cóþg|_t|jj|j|j
¬«|_t|jj|j|j¬«|_ y)zfReset all patterns.
RETURNS: None
DOCS: https://spacy.io/api/spanruler#clear
)r&Ú
fuzzy_compare)rzr&N)
r“rr2r&r*rr%r”s r>rzSpanRuler.clear×s\ð -/ˆŒÜ 'Ø H‰HN‰NØ—]×!
ˆŒ ô
.;Ø H‰HN‰NØ×—]‘]ô.
ˆÕr@có"||vr5ttjjd||j¬««|j
Dcgc]
}|d|k7sŒ |Œc}|_|j D}|j |d|k(sŒ|jjjj|«}||jvr|jj|«||jvsŒ€|jj|«Œœycc}w)z«Remove a pattern by its label.
label (str): Label of the pattern to be removed.
RETURNS: None
DOCS: https://spacy.io/api/spanruler#remove
r©Ú attr_typerÚ componentN)rÚE1024rÖr3r“r2Ú as_stringr±Úremover°)rrÚm_labelÚ m_label_strs r>zSpanRuler.removeéð ˜Ñ ÜÜ ×#¨g¸UÈdÏiÉiÐð
ð&*§^¢^ÓK¡^ °q¸±zÀUÓ7Jš! ^ÑŒØ×/ˆ×ÑÑ9¸"Ÿh™hŸn™n×4×>¸G Ø $×"5Ñ"5Ñ×'×.¨{Ô $§,¡,Ò—LL× Õ
0ùòLs Á
D ÁD Ú
pattern_idcódt|«}|jDcgc]}|jd«|k7sŒ|Œc}|_|t|«k(r5ttj
j
d||j¬««|jD}|j|d|k(sŒ|jjjj|«}||jvr|jj|«||jvsŒ€|jj|«Œœycc}w)z¸Remove a pattern by its pattern ID.
pattern_id (str): ID of the pattern to be removed.
RETURNS: None
DOCS: https://spacy.io/api/spanruler#remove_by_id
ÚIDrãN)r€r“rrr3r2)rÚorig_lenr¼s r>Ú remove_by_idzSpanRuler.remove_by_idýsÿôt“9ˆØ%)§^¢^ÓQ¡^ °q·u±u¸T³{ÀjÓ7Pš! ^ÑŒØ ”s˜4“yÒ ÜÜ ×"¨*ÀÇ Á ðóð