Last Updated: July 05, 2018
· naiquevin

Why not to commit .pyc files into git (and how to fix it if you already did)

Python newcomers often make the mistake of committing .pyc files
to git repositories. It is harmless in most cases, but it can
occasionally cause odd problems later. Since there is no need to
distribute these files with the code in the first place, it's better
to just keep them out of the repository.

Python source files are compiled into bytecode, which is saved on disk
in the form of .pyc files and then executed by the Python virtual
machine. These files are generated automatically, on the fly, on
every machine where the code runs, so there is no point in sharing them
with collaborators.

Besides that, if a .pyc file sits next to a module (where Python 2
writes it by default), its code can still be imported after the
module's source file is deleted. This can lead to strange bugs: a user
deletes a Python module but forgets to delete the .pyc file when
checking in the changes, and a collaborator at the other end has code
that still imports from the now non-existent module (a rare scenario,
but it does happen!).
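The stale-bytecode scenario can be reproduced with a small sketch like the one below. It assumes python3 is on the PATH; the module name oldmod and the scratch directory are made up for the demo. Python 3 normally keeps bytecode under __pycache__/ and ignores it once the source is gone, so the sketch uses compileall's -b flag to write the .pyc next to the source instead.

```shell
# Work in a throwaway directory so nothing real is touched.
demo_dir=$(mktemp -d)
cd "$demo_dir"

# A hypothetical module holding a single constant.
echo 'GREETING = "hello from stale bytecode"' > oldmod.py

# -b writes oldmod.pyc next to the source (the legacy layout)
# rather than under __pycache__/; -q keeps compileall quiet.
python3 -m compileall -q -b oldmod.py

# Delete the source but "forget" to delete the bytecode.
rm oldmod.py

# The import still succeeds, served entirely from oldmod.pyc.
stale_output=$(python3 -c 'import oldmod; print(oldmod.GREETING)')
echo "$stale_output"
```

If that leftover oldmod.pyc were tracked in the repository, every collaborator would receive it and keep importing a module whose source no longer exists.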

To avoid sharing your .pyc files with others, add the entry "*.pyc" to
the .gitignore file, and git will ignore any new .pyc files in the
repo. But what about files that are already being tracked by git? To
fix this, we need to ask git to remove these paths from its index by
running the git rm command with the --cached option, which leaves the
files themselves on disk.

For example,

$ git rm --cached *.pyc
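To see the whole sequence in one place, here is a minimal sketch run in a scratch repository. The file names, commit messages and the demo identity passed with -c are all hypothetical; only the .gitignore entry and git rm --cached come from the steps described here.

```shell
# Throwaway repository; every name below is made up for the demo.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Simulate the mistake: a .pyc file gets committed with its source.
echo 'print("hi")' > app.py
printf 'fake bytecode' > app.pyc
git add app.py app.pyc
git -c user.name=demo -c user.email=demo@example.com commit -q -m 'add app'

# Ignore future .pyc files, then drop the tracked one from the index.
echo '*.pyc' >> .gitignore
git rm -q --cached app.pyc
git add .gitignore
git -c user.name=demo -c user.email=demo@example.com commit -q -m 'stop tracking .pyc files'

# Only the source and .gitignore remain tracked.
tracked=$(git ls-files)
echo "$tracked"

# The file is still on disk; git simply ignores it from now on.
ls app.pyc
```

Note that the removal has to be committed like any other change, otherwise collaborators will keep receiving the old .pyc file.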

Or, to untrack all .pyc files in a project recursively,

$ find . -name '*.pyc' | xargs -n 1 git rm --cached
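One caveat with the find pipeline: it also hands untracked .pyc files to git rm, which complains about each of them, and it splits file names on whitespace. A possible alternative, sketched below in a scratch repository with made-up paths, is to ask git itself for the tracked .pyc files and pass them along NUL-separated. This is just one way to do it, not the only one.

```shell
# Scratch repository with .pyc files at several depths (hypothetical paths).
repo=$(mktemp -d)
cd "$repo"
git init -q

mkdir -p "pkg/sub dir"
printf 'x' > top.pyc
printf 'x' > "pkg/sub dir/deep.pyc"
echo 'pass' > pkg/mod.py
git add .
git -c user.name=demo -c user.email=demo@example.com commit -q -m 'initial'

# An untracked .pyc that find would pick up but git does not know about.
printf 'x' > pkg/untracked.pyc

# List only the tracked .pyc files (NUL-separated, so spaces in
# paths are safe) and remove them from the index in one go.
git ls-files -z -- '*.pyc' | xargs -0 git rm -q --cached --

# Only the source file is left in the index.
remaining=$(git ls-files)
echo "$remaining"
```

The quoted '*.pyc' is matched by git rather than expanded by the shell, which is why it reaches files in subdirectories too.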

Aside: In case you're wondering why Python source files are "compiled
to bytecode" when Python is an interpreted language, I think this answer
on Stack Overflow by Alex Martelli explains it excellently.