Extended pickling support for Python objects
cloudpickle extends Python's built-in pickle module to serialize a wider range of Python objects, which is particularly useful for distributed computing and interactive environments like Jupyter notebooks. It enables the serialization of lambda functions and objects defined in __main__, addressing limitations of standard pickling for dynamic code execution.
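As a minimal sketch of the difference described above, the snippet below tries to serialize a lambda with the standard pickle module and then with cloudpickle. The variable names (square, stdlib_ok, payload) are illustrative only, and the example assumes cloudpickle is installed in both the pickling and unpickling environment.

```python
import pickle

import cloudpickle

square = lambda x: x * x

# The standard library pickles functions by reference (module plus qualified
# name). A lambda's qualified name is "<lambda>", which cannot be looked up
# again, so pickling fails.
try:
    pickle.dumps(square)
    stdlib_ok = True
except (pickle.PicklingError, AttributeError):
    stdlib_ok = False

# cloudpickle embeds the function's code in the payload instead.
payload = cloudpickle.dumps(square)

# The payload is an ordinary pickle stream, so pickle.loads can read it.
restored = pickle.loads(payload)
result = restored(7)
```

The payload is loaded here with the standard pickle.loads; cloudpickle still has to be importable on the loading side because the stream refers to its reconstruction helpers.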
How It Works
cloudpickle primarily achieves its extended serialization capabilities by implementing "serialization by value" for functions and classes. Unlike standard pickle's "serialization by reference" (which relies on module imports in the unpickling environment), cloudpickle can embed the actual code of functions and classes. This is particularly advantageous for cluster computing, where remote workers might not have access to the same modules or environments as the client. Explicit registration (register_pickle_by_value) allows users to opt in to this behavior for specific modules, simplifying deployment in distributed systems.
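The following sketch illustrates the two modes under stated assumptions: demo_helpers is a hypothetical throwaway module written to a temporary directory, and removing it from sys.path and sys.modules is only a stand-in for a remote worker that lacks the module. It uses cloudpickle.register_pickle_by_value and cloudpickle.unregister_pickle_by_value, which take the module object.

```python
import importlib
import pickle
import sys
import tempfile
from pathlib import Path

import cloudpickle

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical helper module, created only for this demonstration.
    Path(tmp_dir, "demo_helpers.py").write_text(
        "def double(x):\n    return 2 * x\n"
    )
    sys.path.insert(0, tmp_dir)
    importlib.invalidate_caches()
    demo_helpers = importlib.import_module("demo_helpers")

    # Default for an importable module: pickled by reference, so the payload
    # only records where to import the function from.
    by_reference = cloudpickle.dumps(demo_helpers.double)

    # Opt in to pickling by value for this module: the code is embedded.
    cloudpickle.register_pickle_by_value(demo_helpers)
    by_value = cloudpickle.dumps(demo_helpers.double)
    cloudpickle.unregister_pickle_by_value(demo_helpers)

    # Simulate an environment where the module is not available.
    sys.path.remove(tmp_dir)
    del sys.modules["demo_helpers"]

# The by-value payload carries the code, so it loads without the module.
restored = pickle.loads(by_value)
result = restored(21)

# The by-reference payload needs to import demo_helpers, which is now gone.
try:
    pickle.loads(by_reference)
    by_reference_loaded = True
except ModuleNotFoundError:
    by_reference_loaded = False
```

In a real deployment the registration would typically happen once on the client before objects are shipped to workers; the unregister call here only keeps the demonstration from leaving global state behind.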
Quick Start & Requirements
pip install cloudpickle
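After installation, a quick round trip might look like the sketch below. It writes a closure to an in-memory buffer with cloudpickle.dump and reads it back with the standard pickle.load; make_scaler and the other names are hypothetical and chosen only for illustration.

```python
import io
import pickle

import cloudpickle


def make_scaler(factor):
    # The nested function closes over `factor`; standard pickle cannot
    # serialize it because it is a local object.
    def scale(x):
        return factor * x

    return scale


buffer = io.BytesIO()
cloudpickle.dump(make_scaler(3), buffer)  # file-like counterpart of dumps()

buffer.seek(0)
scale_by_three = pickle.load(buffer)
value = scale_by_three(5)
```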
Highlighted Details
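Serialization by value for functions and classes, in contrast to pickle's "serialization by reference."
Explicit registration (register_pickle_by_value) to control serialization behavior for modules.
Maintenance & Community
Tests run with tox for multiple Python versions.
Licensing & Compatibility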
Limitations & Caveats
Serialization by value is experimental and may fail if a function pickled by value contains an import statement, or if a function pickled by reference calls a function pickled by value. cloudpickle is not intended for long-term object storage. Loading pickled data from untrusted sources is a security risk because unpickling can execute arbitrary code.