Why not to commit .pyc files into git (and how to fix if you already did)
Python newcomers often make the mistake of committing .pyc files to
git repositories. It is harmless in most cases, but occasionally it
causes odd problems later. Distributing these files with the code is
unnecessary in the first place, so it is better to keep them out of
the repository.
Python source files are compiled into bytecode, which is saved on disk
in the form of .pyc files that the Python virtual machine then uses.
These files are generated automatically, on the fly, on every machine
where the code runs, so sharing them with collaborators is pretty much
useless.
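To see the generated file for yourself, here is a minimal sketch that compiles a throwaway module with the standard library's py_compile, which does explicitly what an import does implicitly. The module name hello.py and the function name compile_demo are made up for illustration.

```python
import py_compile
import tempfile
from pathlib import Path


def compile_demo() -> Path:
    """Compile a throwaway module and return the path of its bytecode file."""
    workdir = Path(tempfile.mkdtemp())
    source = workdir / "hello.py"  # hypothetical module name
    source.write_text("GREETING = 'hi'\n")

    # py_compile.compile returns the path of the bytecode file it wrote.
    pyc_path = py_compile.compile(str(source), doraise=True)
    return Path(pyc_path)


if __name__ == "__main__":
    # On Python 3 the file typically lands in a __pycache__ directory
    # next to the source; Python 2 wrote hello.pyc beside hello.py.
    print(compile_demo())
```

Since any machine can regenerate this file from the source, nothing is lost by leaving it out of version control.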
Besides that, if a .pyc file exists for a module, its code can be
imported even after the module's source file has been deleted. (In
Python 3 this applies only to a .pyc file sitting where the source
used to be; bytecode under __pycache__ is ignored once the source is
gone.) This may lead to strange bugs: a user deletes a Python module
but forgets to delete the .pyc file when checking in the changes, and
a collaborator at the other end has code that still imports from the
now non-existent module (a rare scenario, but it does happen!).
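The stale-import scenario can be reproduced in a few lines. This is only a sketch: the module name stale_mod and the function names are hypothetical, and the cfile argument is used to write the bytecode beside the source so that the import still succeeds after the source is removed.

```python
import importlib
import py_compile
import sys
import tempfile
from pathlib import Path


def import_after_source_deleted() -> str:
    """Delete a module's source, then import it from the leftover .pyc."""
    workdir = Path(tempfile.mkdtemp())
    source = workdir / "stale_mod.py"  # hypothetical module name
    source.write_text("def where():\n    return 'from stale bytecode'\n")

    # Write the bytecode beside the source instead of in __pycache__.
    py_compile.compile(
        str(source), cfile=str(workdir / "stale_mod.pyc"), doraise=True
    )

    source.unlink()  # "delete the module" but forget the .pyc

    sys.path.insert(0, str(workdir))
    importlib.invalidate_caches()
    try:
        module = importlib.import_module("stale_mod")
        return module.where()
    finally:
        sys.path.remove(str(workdir))


if __name__ == "__main__":
    print(import_after_source_deleted())
```

The import succeeds even though stale_mod.py no longer exists, which is exactly the situation a collaborator ends up in when a leftover .pyc is committed.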
To avoid sharing your .pyc files with others, add the entry "*.pyc"
to the .gitignore file, and git will start ignoring any new .pyc files
in the repo. But what about the files that git is already tracking? To
fix this, we need to ask git to remove these paths from its index by
running the git rm command with the --cached option.
For example:
$ git rm --cached *.pyc
Or, to untrack all .pyc files in a project recursively:
$ find . -name '*.pyc' | xargs -n 1 git rm --cached
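If you would rather script the housekeeping in Python, the following is a rough sketch under some assumptions: the helper names ensure_ignored and find_pyc are made up for this post, and neither of them calls git. They only add the ignore entry and list the candidate files, so you can review the list before handing it to git rm --cached.

```python
import tempfile
from pathlib import Path
from typing import List


def ensure_ignored(repo_root: Path, pattern: str = "*.pyc") -> bool:
    """Add pattern to .gitignore if missing; return True if the file changed."""
    gitignore = repo_root / ".gitignore"
    text = gitignore.read_text() if gitignore.exists() else ""
    if pattern in (line.strip() for line in text.splitlines()):
        return False
    # Start the new entry on its own line if the file lacks a final newline.
    separator = "" if text == "" or text.endswith("\n") else "\n"
    gitignore.write_text(text + separator + pattern + "\n")
    return True


def find_pyc(repo_root: Path) -> List[Path]:
    """List .pyc files under repo_root, skipping git's own .git directory."""
    return sorted(
        path
        for path in repo_root.rglob("*.pyc")
        if ".git" not in path.relative_to(repo_root).parts
    )


if __name__ == "__main__":
    # Demo on a scratch directory rather than a real repository.
    root = Path(tempfile.mkdtemp())
    (root / "pkg").mkdir()
    (root / "pkg" / "mod.pyc").write_bytes(b"")
    print(ensure_ignored(root))
    print(find_pyc(root))
```

Note that find_pyc reports every .pyc file on disk, tracked or not, just like the find command above, so git rm --cached may complain about paths it was never tracking.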
Aside: In case you wonder why Python source files are "compiled to
bytecode" when Python is an interpreted language, I think this answer
on Stack Overflow by Alex Martelli explains it excellently.