Python Marshal

Pratikbais
2 min read · Aug 30, 2022

The Python marshal module provides serialization of Python values. In other words, the module includes functions for writing and reading Python objects in a binary format. Unfortunately, the format is undocumented, and the Python maintainers may change it in ways that are incompatible with earlier Python versions. Other parts of Python use the marshal module internally, for instance, to read and write .pyc files that contain pseudo-compiled Python code. But you can also access this serialization mechanism through Python’s public API.
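As a minimal sketch of that public API, the snippet below round-trips a value through marshal.dumps() and marshal.loads(). The sample dictionary is made up for illustration; it only contains types that marshal supports.

```python
import marshal

# A value built only from types that marshal supports
# (numbers, strings, bytes, lists, dicts, ...).
value = {"name": "spam", "count": 3, "ratio": 0.5, "tags": ["a", "b"], "blob": b"\x00\x01"}

# dumps() returns the binary representation as a bytes object.
data = marshal.dumps(value)

# loads() reads one value back from a bytes-like object.
restored = marshal.loads(data)

print(type(data).__name__, len(data))
print(restored == value)
```

The bytes produced here are only meant to be read back by the same Python version, which is why the format is a poor choice for long-term storage or data exchange.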

The marshal module shouldn’t be used with untrusted data. This post demonstrates why, and also shows how the module can be quickly tested with a basic dumb fuzzer.
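To give an idea of what a "dumb" fuzzer can look like, here is a minimal sketch. It is not the fuzzer used for this post: the function names (mutate, try_load, fuzz) and the mutation strategy (overwriting a few random bytes of a valid blob) are assumptions chosen for illustration.

```python
import marshal
import random


def mutate(data: bytes, rng: random.Random, max_flips: int = 4) -> bytes:
    """Return a copy of data with a few randomly chosen bytes overwritten."""
    buf = bytearray(data)
    if not buf:
        return bytes(buf)
    for _ in range(rng.randint(1, max_flips)):
        buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)


def try_load(data: bytes) -> str:
    """Feed data to marshal.loads() and report how the call ended."""
    try:
        marshal.loads(data)
    except Exception as exc:  # a clean Python exception is the expected rejection
        return type(exc).__name__
    return "ok"


def fuzz(seed_value, iterations: int, rng_seed: int = 0) -> dict:
    """Mutate a valid marshal blob repeatedly and count the outcomes."""
    rng = random.Random(rng_seed)
    seed = marshal.dumps(seed_value)
    outcomes = {}
    for _ in range(iterations):
        result = try_load(mutate(seed, rng))
        outcomes[result] = outcomes.get(result, 0) + 1
    return outcomes
```

A call such as fuzz({"key": [1, 2.0, "three"]}, 1000) returns a count per outcome. Python exceptions are the harmless case; a real memory bug would not show up in the counts at all but would crash the interpreter. Because a mutated length field can also request a very large allocation, it is safer to run the loop in a separate, disposable process.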

Because the marshal module is written in C, the easiest fuzzing objective is simply to look for common C programming errors such as buffer overflows, use-after-free bugs, and null-pointer dereferences. The excellent memory checker AddressSanitizer (ASan) can help locate such problems. AddressSanitizer instruments the code during compilation, adding checks for memory corruption, and replaces the malloc and free routines. At runtime, the program then detects memory corruption and reports it right away with lots of helpful details. AddressSanitizer is included in GCC 4.8+, which can be used to build Python.

Building Python with AddressSanitizer

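The CPython source code can be cloned with the following command (CPython development has since moved to GitHub, at https://github.com/python/cpython, so the Mercurial URL below may no longer be available):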

hg clone https://hg.python.org/cpython

If you run ./configure --help, you can see that it has a --with-address-sanitizer option, which is supposed to enable AddressSanitizer. But for some reason it didn’t work for me, so I just used the following commands to build Python:

CFLAGS="-g -fsanitize=address -fno-omit-frame-pointer -O0" \
CPPFLAGS="-fsanitize=address -fno-omit-frame-pointer -O0" \
LDFLAGS="-fsanitize=address" \
./configure \
--prefix=/home/artem/projects/fuzzing/python/build/ \
--disable-ipv6
ASAN_OPTIONS="detect_leaks=0" make
ASAN_OPTIONS="detect_leaks=0" make install

Let me quickly explain what those options mean:

  1. CFLAGS, CPPFLAGS, and LDFLAGS are standard environment variables that specify options for the C compiler, the C preprocessor, and the linker, respectively.
  2. -fsanitize=address enables AddressSanitizer (it has to be passed to both the compiler and the linker).
  3. -g makes GCC produce debugging information.
  4. -O0 turns off compiler optimizations (but slows down execution).
  5. -fno-omit-frame-pointer is for nicer stack traces.
  6. ASAN_OPTIONS is an environment variable that contains parameters for AddressSanitizer at runtime.
  7. ASAN_OPTIONS="detect_leaks=0" turns off the memory leak checker, which is part of AddressSanitizer.
  8. --prefix specifies the directory where the output binaries, libs, etc. are installed.
  9. --disable-ipv6 disables IPv6 (nothing surprising).
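The same ASAN_OPTIONS variable can also be set from Python when launching an instrumented interpreter as a child process. The sketch below shows one way to do it; the helper names asan_env and run_with_asan_options are hypothetical, and the interpreter path in the usage note is a placeholder.

```python
import os
import subprocess


def asan_env(options: dict, base_env=None) -> dict:
    """Return a copy of the environment with ASAN_OPTIONS set from options."""
    env = dict(os.environ if base_env is None else base_env)
    # AddressSanitizer accepts colon-separated name=value pairs.
    env["ASAN_OPTIONS"] = ":".join(f"{k}={v}" for k, v in options.items())
    return env


def run_with_asan_options(cmd: list, options: dict) -> subprocess.CompletedProcess:
    """Run cmd with the given runtime options; ASan reports go to stderr."""
    return subprocess.run(cmd, env=asan_env(options), capture_output=True, text=True)
```

For example, run_with_asan_options(["/path/to/build/bin/python3", "-c", "import marshal"], {"detect_leaks": 0}) would start the instrumented interpreter with leak detection disabled, and any AddressSanitizer report would end up in the stderr attribute of the result.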

If the build runs smoothly, you can run python3.6 --version as a smoke test.
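A slightly stronger smoke test is to ask the new interpreter which flags it was built with. The sketch below uses sysconfig.get_config_var() for that. It assumes the flags passed to configure are recorded in the CFLAGS, LDFLAGS, or CONFIG_ARGS build variables, which may not hold on every platform or Python version; the helper names are made up for illustration.

```python
import sysconfig


def mentions_asan(flags) -> bool:
    """Return True if a flag string contains the AddressSanitizer switch."""
    return isinstance(flags, str) and "-fsanitize=address" in flags


def build_flags_report() -> dict:
    """Check the flags recorded in this interpreter's build configuration."""
    names = ("CFLAGS", "LDFLAGS", "CONFIG_ARGS")
    # get_config_var() returns None for variables that are not recorded.
    return {name: mentions_asan(sysconfig.get_config_var(name)) for name in names}
```

Running print(build_flags_report()) with the freshly built interpreter should show True for at least one entry if the build picked up the flags. A result of all False does not prove the opposite, it only means the flags were not recorded in those variables.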
