INTUIA/Programa final/spacy/training/__pycache__/batchers.cpython-312.pyc
2026-03-15 13:27:50 +00:00

Imported names: partial; Any, Callable, Iterable, Iterator, List, Optional, Sequence, TypeVar, Union; minibatch, registry. Type variable: ItemT.

configure_minibatch_by_padded_size (registered as "spacy.batch_by_padded.v1")

Create a batcher that uses the `batch_by_padded_size` strategy.
The padded size is defined as the maximum length of sequences within the
batch multiplied by the number of sequences in the batch.
size (int or Sequence[int]): The largest padded size to batch sequences into.
Can be a single integer, or a sequence, allowing for variable batch sizes.
buffer (int): The number of sequences to accumulate before sorting by length.
A larger buffer will result in more even sizing, but if the buffer is
very large, the iteration order will be less random, which can result
in suboptimal training.
discard_oversize (bool): Whether to discard sequences that are by themselves
longer than the largest padded batch size.
get_length (Callable or None): Function to get the length of a sequence item.
The `len` function is used by default.

Source file: C:\Users\garci\AppData\Roaming\Python\Python312\site-packages\spacy/training/batchers.py

configure_minibatch_by_words (registered as "spacy.batch_by_words.v1")

Create a batcher that uses the "minibatch by words" strategy.
size (int or Sequence[int]): The target number of words per batch.
Can be a single integer, or a sequence, allowing for variable batch sizes.
tolerance (float): What percentage of the size to allow batches to exceed.
discard_oversize (bool): Whether to discard sequences that by themselves
exceed the tolerated size.
get_length (Callable or None): Function to get the length of a sequence
item. The `len` function is used by default.
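
The tolerance rule described above can be sketched in a few lines. This is a minimal illustration, not the library's code: it assumes that `tolerance` is a fraction of `size` (so 0.2 means 20%), and the helper names `max_batch_words` and `is_oversize` are hypothetical.

```python
def max_batch_words(size: int, tolerance: float) -> float:
    """Largest word count a batch may reach: the target plus the tolerated excess."""
    # Assumption: tolerance is a fraction of size, e.g. 0.2 allows 20% over target.
    return size + size * tolerance


def is_oversize(n_words: int, size: int, tolerance: float) -> bool:
    """True if a single sequence on its own exceeds the tolerated batch size."""
    return n_words > max_batch_words(size, tolerance)


if __name__ == "__main__":
    # A 1300-word sequence against a 1000-word target with 20% tolerance.
    print(is_oversize(1300, size=1000, tolerance=0.2))
    # A 1100-word sequence fits inside the tolerated size.
    print(is_oversize(1100, size=1000, tolerance=0.2))
```

Under this reading, a sequence flagged as oversize is the kind that `discard_oversize` would drop.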

configure_minibatch (registered as "spacy.batch_by_sequence.v1")

Create a batcher that creates batches of the specified size.
size (int or Sequence[int]): The target number of items per batch.
Can be a single integer, or a sequence, allowing for variable batch sizes.

minibatch_by_padded_size

Minibatch a sequence by the size of padded batches that would result,
with sequences binned by length within a window.
The padded size is defined as the maximum length of sequences within the
batch multiplied by the number of sequences in the batch.
size (int or Sequence[int]): The largest padded size to batch sequences into.
buffer (int): The number of sequences to accumulate before sorting by length.
A larger buffer will result in more even sizing, but if the buffer is
very large, the iteration order will be less random, which can result
in suboptimal training.
discard_oversize (bool): Whether to discard sequences that are by themselves
longer than the largest padded batch size.
get_length (Callable or None): Function to get the length of a sequence item.
The `len` function is used by default.
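
The padded-size definition above (longest sequence in the batch times the number of sequences) can be checked with a small sketch. The helper name `padded_size` is hypothetical and is not part of the module; it only restates the definition from the docstring.

```python
from typing import Callable, Sequence


def padded_size(batch: Sequence[Sequence], get_length: Callable = len) -> int:
    """Padded size of a batch: longest sequence length times number of sequences."""
    if not batch:
        return 0
    return max(get_length(seq) for seq in batch) * len(batch)


if __name__ == "__main__":
    # Three sequences of lengths 1, 3 and 2: every row is padded to length 3.
    batch = [["a"], ["a", "b", "c"], ["a", "b"]]
    print(padded_size(batch))
```

A batch with one long sequence and many short ones therefore has a large padded size, which is why the strategy sorts by length within the buffer before batching.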

minibatch_by_words

Create minibatches of roughly a given number of words. If any examples
are longer than the specified batch length, they will appear in a batch by
themselves, or be discarded if discard_oversize=True.
seqs (Iterable[Sequence]): The sequences to minibatch.
size (int or Sequence[int]): The target number of words per batch.
Can be a single integer, or a sequence, allowing for variable batch sizes.
tolerance (float): What percentage of the size to allow batches to exceed.