INTUIA/Programa final/spacy/tests/parser/__pycache__/test_ner.cpython-312.pyc
241 lines
41 KiB
2026-03-15 13:27:50 +00:00
Ë
?û g4uãódddlZddlZddlZddlmZddlmZmZddlm Z ddl
m Z ddl m
Z
ddlmZddlmZdd lmZdd
lmZdd lmZdd lmZmZdd
lmZmZmZddlm Z ddlm!Z!dddgifddddgifgZ"ejFd«Z$ejFd«Z%ejFd«Z&ejFd«Z'ejFd«Z(ejFd«Z)ejTjWddg«ejTjYd«d „««Z-ejTjYd!«d"„«Z.ejTjYd#«d$„«Z/ejTjYd%«d&„«Z0ejTjYd'«d(„«Z1d)„Z2ejTjYd*«d+„«Z3ejTjYd,«d-„«Z4d.„Z5d/„Z6d0„Z7d1„Z8d2„Z9ejTjud3¬4«d5„«Z;ejTjud3¬4«d6„«Z<d7„Z=d8„Z>d9„Z?d:„Z@d;„ZAd<„ZBd=„ZCd>„ZDd?„ZEejTjWd@dAdBg«dC„«ZFdD„ZGdE„ZHdF„ZIdG„ZJdH„ZKdI„ZLedJ«GdK„dL««ZNy)MéN)Ú assert_equal)ÚregistryÚutil)ÚENT_IOB)ÚEnglish)ÚItalian)ÚLanguage)ÚLookups)ÚEntityRecognizer)Ú
BiluoPushDown)ÚDEFAULT_NER_MODEL)ÚDocÚSpan)ÚExampleÚ iob_to_biluoÚsplit_bilu_label©ÚVocabé)Ú make_tempdirúWho is Shaka Khan?Úentities©ééÚPERSONzI like London and Berlin.)ré
ÚLOC)éércóy) non_entities©r#óú\C:\Users\garci\AppData\Roaming\Python\Python312\site-packages\spacy/tests/parser/test_ner.pyÚneg_keyr&sà r$cót«S©Nrr#r$r%Úvocabr)!s ä 7€Nr$có t|gd¢¬«S)N)ÚCaseyÚwentÚtoÚNewÚYorkúÚwords)r)r)s r%Údocr3&sä ˆ Hr$cóz|dd}|dd}|j|jdf|j|jdfgS)NrééérÚGPE)Ú
start_charÚend_char)r3ÚcaseyÚnys r%Ú
entity_annotsr=+sHà !ˆH€EØ ˆQˆqˆ€Bà × Ñ ˜5Ÿ>™>¨8Ð ˜Ÿ  UÐ ðr$c ódtt|Dcgc]\}}}|Œ
c}}}««Scc}}}wr()ÚsortedÚset)r=Úlabels r%Ú entity_typesrD5s*ä ”#±-Õ@±-¡ ! Q¨’u°-Ó BùÔ@s
+cóZtj|¬«}t|j|«S)rD)r Ú get_actionsÚstrings)r)rDÚactionss r%ÚtsysrJ:s#ä×'°\ÔB€GÜ ˜Ÿ¨Ó 0r$rCz
U-JOB-NAMEi¯c
óüt«}i}|jd|¬«}tjt |j
dg¬«dgdgdgdgdg|gdœ«}d |j j|g¬
«d vsJy) ner©ÚconfigÚwordr1rÚtagÚdep)Úidsr2ÚtagsÚheadsÚdepsrzJOB-NAME)Úexamplesr5)r Ú create_piperÚ from_dictrr)ÚmovesrG)rCÚnlprNrLÚexamples r%Útest_issue1967r\@ô ‹*€CØ
€FØ
/‰/˜ˆ
/€CÜ×ÑÜ ˆCI‰I˜f˜XÔ˜ñ

ó
€G𠘟׸ ÐBÀ1Ñ  Er$cóht«}|jd«}|jd«|j«t«}|jd«t |j d«j «dk(sJ|j d«j}|jd||j d«jj«|j|j««d|j d«jvsJ|j d«j dk(sJy)zGTest that spurious 'extra_labels' aren't created when initializing NER.rLÚ CITIZENSHIPrÚ
resize_outputÚ extra_labels)r^N)rÚadd_pipeÚ add_labelÚ
initializeÚlenÚget_pipeÚlabelsÚmodelÚattrsrYÚn_movesÚ
from_bytesÚto_bytesÚcfg)rZrLÚnlp2rgs r%Útest_issue2179rnTô ‹)€CØ
,‰,
€C؇MM ؇NÜ ‹9€D؇MMÜ ˆt}‰}˜UÓ#× +¨qÒ  M‰M˜ × &€EØ €E‡KKÑ  ¨¯ © °UÓ(;×(AÑ(A×(IÑ(IÔ‡OOC—L‘L“NÔ  §¡¨uÓ!5×!9Ñ!9Ñ  =‰=˜Ó × &Ð*:Ò  :r$iQ có¤d}t|«gd¢k(sJd}t|«gd¢k(sJd}t|«gd¢k(sJd}t|«gd¢k(sJy )
z9Test that IOB tags are correctly converted to BILUO tags.)ú B-BRAWLERú I-BRAWLERrq)rprqz L-BRAWLER)úI-ORGrrúB-ORG)rsúL-ORGzU-ORG)úB-PERSONzI-PERSONru)ruúL-PERSONúU-PERSON)úB-MULTI-PERSONzI-MULTI-PERSONrx)rxzL-MULTI-PERSONzU-MULTI-PERSONN)r)Útags1Útags2Útags3Útags4s r%Útest_issue2385r}eshð
4€EÜ ˜Ó Ò"IÒ  '€EÜ ˜Ó Ò"=Ò  0€EÜ ˜Ó Ò"FÒ  B€EÜ ˜Ó Ò"XÒ  Xr$
cóât«}g}|jtj|j d«dgi«g«t d«Dcgc]
}t
|«Œ}}|jd«}t|«D]}|j|«Œ|j«}t d«D]6}i}tj|«|D]}|j|g||d¬«ŒŒ8ycc}w) zdTest issue that arises when too many labels are added to NER model.
Used to cause segfault.
z One sentencerrLégà?)ÚsgdÚlossesÚdropN)rÚextendrrXÚmake_docÚrangeÚstrraÚlistrbrcÚrandomÚshuffleÚupdate) rZÚ
train_dataÚirDrLÚ entity_typeÚ optimizerrr[s r%Útest_issue2800rvô
‹)€CØ€JØ×ÑÜ × Ñ ˜3Ÿ<™<¨Ó7¸*ÀbÐ9IÓ JÐô%*¨$¤KÓ0¡K˜q”C˜•F K€LÐ
,‰,
€Cܘ)ˆ Ø
Ó €IÜ
2ŽYˆØˆÜ!ˆ J‰J˜y i¸ÀSˆ ùò 1sÁC,i‰ có¨t«}|jd«}|jd«|j«gd¢}|j|k(sJt«}|jd«}|j
}|j d||jj«|j|j««|j|k(sJy)z»Test issue that occurred in spaCy nightly where NER labels were being
mapped to classes incorrectly after loading the model, when the labels
were added using ner.add_label().
rLÚANIMAL)ÚOzB-ANIMALzI-ANIMALzL-ANIMALzU-ANIMALr_N) rrarbrcÚ
move_namesrgrhrYrirjrk)rZrLr“rmÚner2rgs r%Útest_issue3209r•Œô ‹)€CØ
,‰,
€C؇MM؇NÚF€JØ >‰>˜  9€DØ =‰=˜Ó €DØ J‰J€EØ €E‡KKÑ  ¨¯ © ×(9Ñ(9Ô‡OOC—LL“NÔ ?‰?˜  (r$cóàt«}|jd«}|jd«|j«gd¢}dh}|j|k(sJt |j «|k(sJy)zBTest that labels are inferred correctly when there's a - in label.rLz LARGE-ANIMAL)rzB-LARGE-ANIMALzI-LARGE-ANIMALzL-LARGE-ANIMALzU-LARGE-ANIMALN)rrarbrcr“r@rf)rZrLr“rfs r%Útest_labels_from_BILUOr— sfä
)€CØ
,‰,
€C؇MM‡Nò€JðÐ
€FØ >‰>˜  ˆsz‰z?˜  $r$cóøt«}|jd«}|jd«|j«d|jvsJ|d«}|j d«sJ|D]}|j dk(rŒJdddœg}|jd «}|j|«d |jvsJd|jvsJ|d«}|j d«sJ|D]}|j dk(rŒJy
) zDTest that running an entity_ruler after ner gives consistent resultsrLÚPEOPLEÚhirrÚSOFTWAREÚspacy©rCÚpatternÚ entity_rulerN)rrarbrcÚ
pipe_namesÚhas_annotationÚent_iobÚ add_patterns)rZrLÚdoc1ÚtokenÚpatternsÚrulerÚdoc2s r%Útest_issue4267r©²sô ‹)€CØ
,‰,
€C؇MM؇NØ C—N  ˆt9€DØ × Ñ ˜yÔ  ˆØ}‰} ÓðÑ:€HØ L‰L˜Ó (€EØ ×Ñ Ø ˜SŸ^™^Ñ  C—N  ˆt9€DØ × Ñ ˜yÔ  ˆØ}‰} Ór$cóÔd}d}t«}||dœ}|jd|¬«}|jd«|j«|d«}t |j
«dk(sJd|j
vsJt
|d d
d ¬ «}t|j«|gz|_|g}|j|d
||¬«t |j
«dk(sJd |j
vsJy)z:This should not crash or exit with some strange error codeéç-Cëâ6Ú
beam_widthÚ beam_densityÚbeam_nerrMÚ
SOME_LABELzWhat do you think about Apple ?r5r7éÚMY_ORG©rCç)rrN)
rrarbrcrdrfrr‡ÚentsÚ
beam_parse)rZrNrLr3Ú apple_entÚdocss r%Útest_issue4313rºÌð€JØ€LÜ
‹)€Cà Ø€Fð ,‰,z¨&ˆ
1€C؇MM؇Ná
Ð
0€CÜ ˆsz‰z‹?˜aÒ ÐÐ Ø ˜3Ÿ:™:Ñ  S˜!˜Q /€IÜC—HH~   Ñ+€C„Hð
ˆ5€D؇NN4˜c¨jÀ|€NÔ ˆsz‰z‹?˜aÒ ÐÐ Ø s—z‘zÑ  !r$có²tj|d|i«}|j|d¬«}|Dcgc]}|j|«Œ}}|gd¢k(sJycc}w)NrF)Ú_debug)rwrrúB-GPEúL-GPEr)rrXÚget_oracle_sequenceÚget_class_name)rJr3r=r[Ú act_classesÚactÚnamess r%Útest_get_oracle_movesrÄçsaÜ×Ñ  j°-Ð%@ÓA€GØ×*¨7¸A€KÙ1<Ó ¨#ˆT×
Ñ
 Õ
€EÐ Ò  Aùò
>s°Acó¶||jd<t|ddg¬«}ddg}tj|d|i«}t |j
ddd ¬
«t |j
dd d ¬
«g|j
j |<|j|«}|Dcgc]}|j|«Œ}}|sJ|dd k7sJ|dd
k7sJ|ddk7sJycc}w)zTest that we don't get stuck in a two word input when we have a negative
span. This could happen if we don't have the right check on the B action.
r&ÚBr1Nrrr5rr´rrrurv© rlrrrXrÚspansr¿© rJr)r&r3r=r[s r%Ú$test_negative_samples_two_word_inputrÌîð"€D‡HHˆÜ
ˆe˜C ˜
&€Cؘ4L€MÜ×Ñ  j°-Ð%@ÓA€Gô
ˆWY‰Y˜˜1  ˆWY‰Y˜˜1  €G‡II‡OOð×*¨7Ó3€KÙ1<Ó ¨#ˆT×
Ñ
 Õ
€EÐ €Lˆ5Ø ‰8sŠ?Ј?Ø ‰8  ‰8  !ùò
>sÂCcó¢||jd<t|gd¢¬«}gd¢}tj|d|i«}t |j
ddd¬ «t |j
dd
d ¬ «g|j
j |<|j|«}|Dcgc]}|j|«Œ}}|sJ|ddk7sJ|dd k7sJy
cc}w)úHTest that we exclude a 2-word entity correctly using a negative example.r&)ÚCr1)NNNrrr5rr´rrruNrÈs r%Ú&test_negative_samples_three_word_inputrÐà!€D‡HHˆYÑÜ
ˆeš?Ô
+€CÚ&€MÜ×Ñ  j°-Ð%@ÓA€Gô
ˆWY‰Y˜˜1  ˆWY‰Y˜˜1  €G‡II‡OOð×*¨7Ó3€KÙ1<Ó ¨#ˆT×
Ñ
 Õ
€EÐ €Lˆ5Ø ‰8sŠ?Ј?Ø ‰8  !ùò
>sÂC cóž||jd<t|dg¬«}dg}tj|d|i«}t |j
ddd¬ «t |j
ddd
¬ «g|j
j |<|j|«}|Dcgc]}|j|«Œ}}|sJ|ddk7sJ|dd k7sJycc}w) r&r1Nrrr5rr´rrws r%Útest_negative_samples_U_entityrÒà!€D‡HHˆYÑÜ
ˆe˜C˜
!€CØF€MÜ×Ñ  j°-Ð%@ÓA€Gô
ˆWY‰Y˜˜1  ˆWY‰Y˜˜1  €G‡II‡OOð×*¨7Ó3€KÙ1<Ó ¨#ˆT×
Ñ
 Õ
€EÐ €Lˆ5Ø ‰8sŠ?Ј?Ø ‰8  !ùò
>sÂC
cóˆtj|¬«}t|j|d¬«}|jddk(sJy)NrFr")Úincorrect_spans_keyr&)r rGrHrl)r)rDrIrJs r%Ú%test_negative_sample_key_is_in_configrÕ*s;Ü×'°\ÔB€GÜ ˜Ÿ¨À^Ô T€DØ 8‰8    0r$zNo longer supported)Úreasoncó6gd¢}gd¢}t||¬«}tj|||dœ«}t|j«}d}|D}|Œ|dk(r"|j |j
d«d«Œ-t|«\}} |j |j
d«| «|j |j
d «| «|j |j
d
«| «|j |j
d «| «ŒÀ|j|«y) N)Ú52ÚBomber)NNz L-PRODUCTr1)r2r©ÚMrÇÚUr’r’Ú© rrrXr rHÚ
add_actionÚindexrr¿©
Úen_vocabr2Ú
biluo_tagsr3r[rYÚ
move_typesrPÚactionrCs
r%Útest_oracle_moves_missing_Brè2sâ !€EÚ*€Jä
ˆh˜
$€CÜ×Ñ ¨uÀ*Ñ%MÓN€Gä ˜(× +€EØ/€JÛˆØ ˆ;Ø Ø
CŠZØ × Ñ ˜Z×-¨cÓ2°BÕ ,¨SÓ1‰MˆF × Ñ ˜Z×-¨cÓ2°EÔ × Ñ ˜Z×-¨cÓ2°EÔ × Ñ ˜Z×-¨cÓ2°EÔ × Ñ ˜Z×-¨cÓ2°EÕ ð
×ј&r$cóngd¢}gd¢}t||¬«}tj|d|i«}t|j«}d}|D][}|Œ|dk(r"|j |j
d«d«Œ-t|«\}} |j |j
|«| «Œ]|j|«y)N) Ú
productionú
ÚofÚNorthroprëzCorp.rëz'sÚradar) rrrrsNrrrtrrr1rrs
r%Útest_oracle_moves_whitespacerïLâ V€EÚK€Jä
ˆh˜
$€CÜ×Ñ  j°*Ð%=Ó>€Gä ˜(× +€EØ/€JÛˆØ ˆ;Ø Ø
CŠZØ × Ñ ˜Z×-¨cÓ2°BÕ ,¨SÓ1‰MˆF × Ñ ˜Z×-¨fÓ5°uÕ ð
×ј&r$cót«}|d«}i}|jd|¬«}|Dcgc]}|jŒc}gd¢k(sJ|Dcgc]}|jŒc}gd¢k(sJ|jj dd«|j
d«|jj|g«d}|jj|d «|jj|d «|jj|d «|jj|d
«sJt«}|d«}i}|jd|¬«}|jg|d dgd ¬
«|Dcgc]}|jŒc}gd¢k(sJ|Dcgc]}|jŒc}gd¢k(sJ|jj dd«|jj dd«|j
d«|jj|g«d} |jj| d «|jj| d «|jj| d «|jj| d
«rJ|jj| d«sJ|jj| d«|jj| d
«rJ|jj| d«sJycc}wcc}wcc}wcc}w)z5Test succesful blocking of tokens to be in an entity.úI live in New YorkrLrM©r7r8rrr6Ú
unmodified©ÚblockedÚdefault)ézU-N) rrWÚent_iob_Ú ent_type_rYrbÚ
init_batchÚapply_transitionÚis_validÚset_ents)
Únlp1r¤rNÚner1r¥Ústate1rmr”Ústate2s
r%Útest_accept_blocked_tokenraô ‹9€DÙ Ð %€DØ
€FØ × Ñ ˜E¨&Ð Ó 1€DÙ(,Ó ˜uˆEN‹N¨Ñ -Ò1EÒ  EÙ)-Ó  ˆEO‹O¨Ñ .Ò2FÒ   ‡Jј!˜RÔ Ø‡NNà
Z‰Z×
" D 
*¨1Ñ
-€F؇JÑ ¨Ô‡JÑ ¨Ô‡JÑ ¨Ô :‰:× Ñ ˜v wÔ   ‹9€DÙ Ð %€DØ
€FØ × Ñ ˜E¨&Ð Ó 1€Dð ‡MM"˜t A a˜y˜k°<€MÔ@Ù(,Ó ˜uˆENÑ -Ò1GÒ  GÙ)-Ó  ˆEO‹O¨Ñ .Ò2FÒ   ‡Jј!˜RÔ Ø‡Jј!˜RÔ Ø‡NNØ
Z‰Z×
" D 
*¨1Ñ
-€F؇JÑ ¨Ô‡JÑ ¨Ô‡JÑ ¨Ôz‰z×" 6¨7Ô :‰:× Ñ ˜v tÔ  ‡JÑ ¨Ôz‰z×" 6¨7Ô :‰:× Ñ ˜v tÔ  ,ùòO .ùÚ .ùò* .ùÚ .s¬K:Á
K?Å+LÆ L c ó¦dddgifddgifg}t«}g}|D]<}|jtj|j |d«|d««Œ>|j dd¬ «}|j
d
«|j«td «D]5}i}tj|d ¬
«}|D]}|j||¬«ŒŒ7y)z7Test that training an empty text does not throw errors.rrrrr5rLÚlastrré©Úsize©rN) rÚappendrrXr„rarbrcr…rÚ minibatchrŠ© rrZÚtrain_examplesÚtrLÚitnrÚbatchesÚbatchs r%Útest_train_emptyrð
 
Ð->Ð,?Ð ˆj˜
Ðð€Jô
)€CØ€NÛ
ˆØ×Ñœg×· ± ¸Q¸q¹TÓ0BÀAÀaÁDÓà
,‰,u 4ˆ
(€C؇MM؇NÜQŽxˆØˆÜ—.‘. °aÔÛˆEØ J‰Ju Vˆ ñr$c óødddgifg}t«}g}|D]<}|jtj|j |d«|d««Œ>|j dd¬«}|j
d «|j«td
«D]W}i}tj|d ¬ «}|D]7}tjt«5|j||¬
«ddd«Œ9ŒYy#1swYŒFxYw)zFTest that the deprecated negative entity format raises a custom error.rr)rrz!PERSONrr5rLTrrrrrr N)rr
rrXr„rarbrcr…rr ÚpytestÚraisesÚ
ValueErrorrŠr s r%Útest_train_negative_deprecatedr§ð
 
Ð-?Ð,@ЀJô ‹)€CØ€NÛ
ˆØ×Ñœg×· ± ¸Q¸q¹TÓ0BÀAÀaÁDÓà
,‰,u 4ˆ
(€C؇MM؇NÜQŽxˆØˆÜ—.‘. °aÔÛˆEÜœzÕ
˜
Ôñ÷*ús ÃC0Ã0C9 cóìt«}|jd«|j«|d«}|Dcgc]}|jŒc}gd¢k(sJ|Dcgc]}|jŒc}gd¢k(sJi}|j d|¬«}|j jdd«|jd«|j j|g«d }|j j|d
«sJ|j j|d «sJ|j j|d
«|j j|d «sJ|j j|d
«sJycc}wcc}w)NrL)rrrrrrMr7r8rzU-GPEzI-GPEr¾) rrarcrWrYrb)rZr3rNr”Ústates r%Útest_overwrite_tokenr¼sDÜ
‹)€C؇LLÔØ‡Ná
Ð
#€CÙ(+Ó ˜uˆENÑ ,Ò0IÒ  IÙ),Ó  ˆEO‹O¨Ñ -Ò1EÒ 
€FØ ?‰?˜5¨ˆ?Ó 0€D؇Jј!˜RÔ Ø‡NNØ J‰J× ! 3 %Ó Ñ +€EØ :‰:× Ñ ˜u gÔ  :‰:× Ñ ˜u gÔ  ‡JÑ  wÔ :‰:× Ñ ˜u gÔ  :‰:× Ñ ˜u gÔ  .ùò -ùÚ -s ¸E,ÁE1cóÜt«}|jd«}|jd«|j«|d«}gd¢}|Dcgc]}|jŒc}|k(sJycc}w)NrLÚMY_LABELz3John is watching the news about Croatia's elections) rrrrr’r’rrr)rrarbrc)rZrLr3Úresultr¥s r%Útest_empty_nerrÑsaÜ
)€CØ
,‰,
€C؇MM؇NÙ
Ð
D€Câ
:€FÙ(+Ó ˜uˆEN‹N¨Ñ Ò  6ùÒ ,sÁ
A)có|t«}dddœg}|jd«}|jd«}|jd«|j«|j |«|d«}gd¢}gd ¢}|Dcgc]}|j
Œc}|k(sJ|Dcgc]}|j Œc}|k(sJy
cc}wcc}w) zLTest that an NER works after an entity_ruler: the second can add annotationsÚTHINGÚThisrrLrú*This is Antti Korhonen speaking in Finland©rrr’r’r’r’©r rrarbrc)rZÚ
untrained_nerr3Ú
expected_iobsÚexpected_typesr¥s r%Útest_ruler_before_nerr)Üä
)€Cð"¨fÑ6€HØ L‰L˜Ó (€Eð—L‘L Ó'€MØ×ј‡NØ ×Ñ Ù
Ð
;€CÚ7€MÚ6€NÙ(+Ó ˜uˆENÑ
Ò  =Ù),Ó  ˆEO‹O¨Ñ Ò  ?ùò -ùÚ -s Á9B4ÂB9có‚ddi}dti}tj|d¬«d}t||fi|¤Žt||«y)update_with_oracle_cut_sizeédrgT)Úvalidate)r
rÚresolver )rNrlrgs r%Útest_ner_constructorr/ðsKà% €Fð Ô
&€CÜ × Ñ ˜S¨4Ô Ñ 9€EÜX˜uÑÒX˜%r$có€t«}|jdd¬«}|jd«|j«dddœg}|jd«}|j |«|d «}gd
¢}gd ¢}|Dcgc]}|j
Œc}|k(sJ|Dcgc]}|j Œc}|k(sJy cc}wcc}w)
zTTest that an entity_ruler works after an NER: the second can overwrite O annotationsrLÚuner)Únamerr r!rr"r#r$Nr%)rZr&r3r'r(s r%Útest_ner_before_rulerr3ús¿ä
‹)€Cð—L‘L ¨V4€MØ×јJÔ‡Nð"¨fÑ6€HØ L‰L˜Ó (€EØ ×Ñ á
Ð
;€CÚ7€MÚ6€NÙ(+Ó ˜uˆENÑ
Ò  =Ù),Ó  ˆEO‹O¨Ñ Ò  ?ùò -ùÚ -s Á;B6ÂB;cóXt«}|jddddœ¬«|jd«}|jd«|j«|d«}gd ¢}gd
¢}|Dcgc]}|jŒc}|k(sJ|Dcgc]}|j
Œc}|k(sJy cc}wcc}w) zITest functionality for blocking tokens so they can't be in a named entityÚblockerrr7)ÚstartÚendrMrLrz,This is Antti L Korhonen speaking in Finland)rrr’r’r’)N)rrarbrc)rZr&r3r'r(s r%Útest_block_nerr8ô ‹)€C؇LL¨Q°qÑ#9€LÔ—LL Ó'€MØ×ј‡NÙ
Ð
=€CÚ<€MÚ5€NÙ(+Ó ˜uˆENÑ