(trl) [alex@compute-od-gpu-st-p4d-24xlarge-205 trl]$ accelerate launch --config_file configs/fsdp_config_local.yaml test_trl_accelerate.py
Reusing dataset imdb (/home/alex/.cache/huggingface/datasets/imdb/plain_text/1.0.0/2fdd8b9bcadd6e7055e742a706876ba43f19faee861df134affd7a3f60fc38a1)   [printed once by each of the 8 ranks]
Parameter 'function'=<function <lambda> at 0x7f589f413d30> of the transform [email protected] couldn't be hashed properly, a random hash was used instead. Make sure your transforms and parameters are serializable with pickle or dill for the dataset fingerprinting and caching to work. If you reuse this transform, the caching mechanism will consider it to be different from the previous calls and recompute everything. This warning is only showed once. Subsequent hashing failures won't be showed.   [printed once by each rank]
100%|██████████| 25/25 [00:00<00:00, 247.47ba/s]   [dataset map progress bar, one per rank, ~241-250 ba/s]
/home/alex/.envs/trl/lib64/python3.8/site-packages/transformers/pipelines/text_classification.py:89: UserWarning: return_all_scores is now deprecated, use top_k=1 if you want similar functionnality
  warnings.warn(   [printed once by each rank]
DEVICE: cuda:0  DEVICE: cuda:1  DEVICE: cuda:2  DEVICE: cuda:3  DEVICE: cuda:4  DEVICE: cuda:5  DEVICE: cuda:6  DEVICE: cuda:7
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks... To disable this warning, you can either: [remainder of the warning not captured]
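The hashing warning above means the callable passed to `Dataset.map()` could not be serialized for fingerprinting, so the mapped result is never cached and each rank/run recomputes it. A common workaround is to pass a named, module-level function instead of a lambda. A minimal sketch, assuming the script tokenizes the IMDB "text" column with a GPT-2 tokenizer (the real map callable in test_trl_accelerate.py is not shown here, so these names are illustrative):

    # Illustrative sketch only -- the actual map function used by the script is an assumption.
    from datasets import load_dataset
    from transformers import GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")

    def tokenize(sample):
        # A named, module-level function can be serialized, so datasets can hash it
        # and reuse the cached Arrow files instead of recomputing the map on every rank/run.
        sample["input_ids"] = tokenizer.encode(sample["text"])[:64]
        return sample

    ds = load_dataset("imdb", split="train")
    ds = ds.map(tokenize)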

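The `return_all_scores` UserWarning comes from the sentiment-analysis pipeline, presumably used here for the reward signal. Nothing is broken, but the non-deprecated spelling uses `top_k`; a hedged sketch, with the model name assumed (the script's actual pipeline call is not shown above):

    # Assumed model name and kwargs -- adjust to whatever the script really uses.
    from transformers import pipeline

    sentiment_pipe = pipeline("sentiment-analysis", model="lvwerra/distilbert-imdb", device=0)

    # return_all_scores=True  -> top_k=None  (return every class score)
    # return_all_scores=False -> top_k=1     (return only the top class)
    scores = sentiment_pipe("this movie was really good!!", top_k=None, function_to_apply="none")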
File "/home/alex/.envs/trl/lib64/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context File "/home/alex/.envs/trl/lib64/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context response = gpt2_model.generate(query_tensors[i].unsqueeze(dim=0),response = gpt2_model.generate(query_tensors[i].unsqueeze(dim=0),

response = gpt2_model.generate(query_tensors[i].unsqueeze(dim=0),response = gpt2_model.generate(query_tensors[i].unsqueeze(dim=0), File "/home/alex/.envs/trl/lib64/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context

File "/home/alex/.envs/trl/lib64/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context

response = gpt2_model.generate(query_tensors[i].unsqueeze(dim=0), File "/home/alex/.envs/trl/lib64/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context File "/home/alex/.envs/trl/lib64/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context

0it [00:05, ?it/s] File "/home/alex/.envs/trl/lib64/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context return func(*args, **kwargs) return func(*args, **kwargs) File "/home/alex/.envs/trl/lib64/python3.8/site-packages/transformers/generation_utils.py", line 1320, in generate

    return func(*args, **kwargs)return func(*args, **kwargs)  File "/home/alex/.envs/trl/lib64/python3.8/site-packages/transformers/generation_utils.py", line 1320, in generate

return func(*args, **kwargs)

return func(*args, **kwargs)return func(*args, **kwargs) File "/home/alex/.envs/trl/lib64/python3.8/site-packages/transformers/generation_utils.py", line 1320, in generate File "/home/alex/.envs/trl/lib64/python3.8/site-packages/transformers/generation_utils.py", line 1320, in generate File "/home/alex/.envs/trl/lib64/python3.8/site-packages/transformers/generation_utils.py", line 1320, in generate

File "/home/alex/.envs/trl/lib64/python3.8/site-packages/transformers/generation_utils.py", line 1320, in generate File "/home/alex/.envs/trl/lib64/python3.8/site-packages/transformers/generation_utils.py", line 1320, in generate return self.sample( File "/home/alex/.envs/trl/lib64/python3.8/site-packages/transformers/generation_utils.py", line 1938, in sample return self.sample( File "/home/alex/.envs/trl/lib64/python3.8/site-packages/transformers/generation_utils.py", line 1938, in sample return self.sample(return self.sample(

File "/home/alex/.envs/trl/lib64/python3.8/site-packages/transformers/generation_utils.py", line 1938, in sample File "/home/alex/.envs/trl/lib64/python3.8/site-packages/transformers/generation_utils.py", line 1938, in sample return self.sample( File "/home/alex/.envs/trl/lib64/python3.8/site-packages/transformers/generation_utils.py", line 1938, in sample return self.sample(return self.sample(

File "/home/alex/.envs/trl/lib64/python3.8/site-packages/transformers/generation_utils.py", line 1938, in sample File "/home/alex/.envs/trl/lib64/python3.8/site-packages/transformers/generation_utils.py", line 1938, in sample outputs = self( File "/home/alex/.envs/trl/lib64/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl outputs = self( File "/home/alex/.envs/trl/lib64/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl outputs = self(outputs = self(

File "/home/alex/.envs/trl/lib64/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl File "/home/alex/.envs/trl/lib64/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl outputs = self( File "/home/alex/.envs/trl/lib64/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl outputs = self( File "/home/alex/.envs/trl/lib64/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl outputs = self( File "/home/alex/.envs/trl/lib64/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl return forward_call(*input, **kwargs) File "/home/alex/trl/trl/gpt2.py", line 109, in forward return forward_call(*input, **kwargs) File "/home/alex/trl/trl/gpt2.py", line 109, in forward return forward_call(*input, **kwargs)return forward_call(*input, **kwargs)

File "/home/alex/trl/trl/gpt2.py", line 109, in forward File "/home/alex/trl/trl/gpt2.py", line 109, in forward return forward_call(*input, **kwargs) return forward_call(*input, **kwargs) File "/home/alex/trl/trl/gpt2.py", line 109, in forward File "/home/alex/trl/trl/gpt2.py", line 109, in forward transformer_outputs = self.transformer( File "/home/alex/.envs/trl/lib64/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl transformer_outputs = self.transformer( File "/home/alex/.envs/trl/lib64/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl transformer_outputs = self.transformer(transformer_outputs = self.transformer(

File "/home/alex/.envs/trl/lib64/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl File "/home/alex/.envs/trl/lib64/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl return forward_call(*input, **kwargs) return forward_call(*input, **kwargs) File "/home/alex/trl/trl/gpt2.py", line 109, in forward File "/home/alex/.envs/trl/lib64/python3.8/site-packages/transformers/models/gpt2/modeling_gpt2.py", line 917, in forward return forward_call(*input, **kwargs) File "/home/alex/.envs/trl/lib64/python3.8/site-packages/transformers/models/gpt2/modeling_gpt2.py", line 917, in forward return forward_call(*input, **kwargs)return forward_call(*input, **kwargs)

File "/home/alex/.envs/trl/lib64/python3.8/site-packages/transformers/models/gpt2/modeling_gpt2.py", line 917, in forward File "/home/alex/.envs/trl/lib64/python3.8/site-packages/transformers/models/gpt2/modeling_gpt2.py", line 917, in forward transformer_outputs = self.transformer( File "/home/alex/.envs/trl/lib64/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl transformer_outputs = self.transformer( File "/home/alex/.envs/trl/lib64/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl hidden_states = self.ln_f(hidden_states)hidden_states = self.ln_f(hidden_states)

File "/home/alex/.envs/trl/lib64/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl File "/home/alex/.envs/trl/lib64/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl transformer_outputs = self.transformer( File "/home/alex/.envs/trl/lib64/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl return forward_call(*input, **kwargs) hidden_states = self.ln_f(hidden_states) File "/home/alex/.envs/trl/lib64/python3.8/site-packages/transformers/models/gpt2/modeling_gpt2.py", line 917, in forward

File "/home/alex/.envs/trl/lib64/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl hidden_states = self.ln_f(hidden_states) File "/home/alex/.envs/trl/lib64/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl return forward_call(*input, **kwargs) File "/home/alex/.envs/trl/lib64/python3.8/site-packages/transformers/models/gpt2/modeling_gpt2.py", line 917, in forward return forward_call(*input, **kwargs)return forward_call(*input, **kwargs)

File "/home/alex/.envs/trl/lib64/python3.8/site-packages/torch/nn/modules/normalization.py", line 189, in forward File "/home/alex/.envs/trl/lib64/python3.8/site-packages/torch/nn/modules/normalization.py", line 189, in forward return forward_call(*input, **kwargs) File "/home/alex/.envs/trl/lib64/python3.8/site-packages/transformers/models/gpt2/modeling_gpt2.py", line 917, in forward hidden_states = self.ln_f(hidden_states)return forward_call(*input, **kwargs)