Over the past few weeks I've been looking into a problem that kept us from releasing to Heroku: the git repository was too large. Over time it had grown to approximately 180 MB.
In the past we made some decisions that we probably wouldn't make again. For example, we decided to store the gem files in the vendor folder and to include the Rails code in our repository. Our goal at the time was to make releasing the project with Capistrano easier and faster, because it wouldn't have to download all the gems.
That leaves us with a repository full of files that we want to remove in order to shrink it.
Find large files in your git history
First we have to track down which files are making our repository so large. I found a Perl script that scans your repository for files above a given size. Here's the code:
I placed this file in my repository and ran the command below to list all files, from all commits, that are bigger than 500 KB:
This gives you output like:
As you can see, the following paths contain data that we no longer need and that only bloats our repository: vendor/cache, vendor/bundle, vendor/rails and log/development.log.
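If you don't have such a script at hand, a similar listing can be produced with git plumbing alone. The sketch below is not the Perl script mentioned above; it is a minimal stand-in that assumes a reasonably recent git (one that supports `%(rest)` in `git cat-file --batch-check`), and `list_large_blobs` is a hypothetical helper name chosen for illustration.

```shell
# Sketch: list every blob anywhere in the history that is larger than a
# threshold given in bytes. list_large_blobs is a hypothetical helper name,
# not a git command.
list_large_blobs() {
  min_bytes="${1:-512000}"   # assumed default: 500 KB expressed in bytes

  # rev-list prints "<sha> <path>" for every object reachable from any ref;
  # cat-file adds the object type and size and passes the path through
  # as %(rest).
  git rev-list --objects --all |
    git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' |
    awk -v min="$min_bytes" '
      $1 == "blob" && $3 > min {
        size = $3; sha = $2
        sub(/^[^ ]+ [^ ]+ [^ ]+ /, "")   # keep the full path, spaces included
        print size, sha, $0
      }' |
    sort -rn                              # largest first
}

# Usage, from inside the repository:
#   list_large_blobs 512000
```

Each output line is a size in bytes, the blob's hash and its path, so files that were deleted long ago but still live in the history show up as well.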
Removing the files from git
To remove the folders listed above I used the commands from the "Remove sensitive data" article on GitHub Help.
The only thing I changed was adding -rf to the git rm command to recursively force-remove the files, because I am dealing with multiple files and folders within the target folder.
The final command I used was:
Note the vendor/cache/* vendor/bundle/* vendor/rails/* log/development.log part: you can provide multiple paths.
When the command has finished, the history has been rewritten, but the size of the repository hasn't changed yet. The old objects are still present in your local repository until they are pruned.
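For reference, a command of this shape can be sketched as follows. This is a reconstruction rather than a verbatim copy: it assumes the `git filter-branch` template that GitHub Help documented at the time, with `-rf` added to `git rm`, and `purge_paths` is a hypothetical wrapper name.

```shell
# Sketch: rewrite every branch and tag so that the given paths never existed.
# purge_paths is a hypothetical wrapper; the flags are standard
# git filter-branch / git rm options.
purge_paths() {
  if [ "$#" -eq 0 ]; then
    echo "usage: purge_paths <path>..." >&2
    return 1
  fi

  # "$*" splices the paths into the filter command, so paths containing
  # spaces would need extra quoting (not handled in this sketch).
  git filter-branch --force --index-filter \
    "git rm -rf --cached --ignore-unmatch $*" \
    --prune-empty --tag-name-filter cat -- --all
}

# Usage, from the repository root (quote the globs so git expands them,
# not your shell):
#   purge_paths 'vendor/cache/*' 'vendor/bundle/*' 'vendor/rails/*' log/development.log
```

`--ignore-unmatch` keeps the rewrite going on commits where a path does not exist, and `--prune-empty` drops commits that end up with no changes at all.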
Cleanup and reclaim space
You have to run the following commands to actually remove the files from your local repository:
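A typical cleanup sequence after a history rewrite looks like the sketch below. It assumes the rewrite was done with `git filter-branch` (which leaves backup refs under refs/original) and that you run it from the repository root; `reclaim_space` is a hypothetical helper name.

```shell
# Sketch: drop everything that still references the old history, then
# repack. reclaim_space is a hypothetical helper name.
reclaim_space() {
  # Backup refs written by filter-branch (assumes they are still loose
  # files under .git/refs/original).
  rm -rf .git/refs/original/ &&
    # Forget every reflog entry, so the old commits become unreachable.
    git reflog expire --expire=now --all &&
    # Delete the unreachable objects right away and repack what is left.
    git gc --prune=now --aggressive
}

# Usage, from the repository root:
#   reclaim_space
```

Only after the last step does the .git folder actually get smaller on disk.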
Now we can force push our repository so that others can enjoy our effort.
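A force push of all rewritten branches and tags can be sketched like this. The remote name `origin` is an assumption, and `publish_rewritten_history` is a hypothetical helper name.

```shell
# Sketch: overwrite the remote's branches and tags with the rewritten ones.
# publish_rewritten_history is a hypothetical helper name.
publish_rewritten_history() {
  remote="${1:-origin}"   # assumed remote name

  git push "$remote" --force --all &&
    git push "$remote" --force --tags
}

# Usage:
#   publish_rewritten_history origin
```

Because every rewritten commit has a new hash, collaborators should re-clone or rebase their work onto the new history; merging an old clone would bring the large files straight back.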
The result (98.55% smaller)
After following the steps above, our repository shrank by 98.55%! It went from 180 MB to 2.6 MB.
Of course, we got these numbers because there were a lot of gem files in our repository that were updated frequently and pushed to the master branch over and over again.
I hope this post helps others track down large files in their git repositories and remove them to shrink the repositories' size.
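To check the numbers on your own repository, the reduction can be computed from the size before and after. This is a small sketch; `shrink_percent` is a hypothetical helper name, and the sizes can be read from `git count-objects -vH` or from the size of the .git folder.

```shell
# Sketch: percentage by which a repository shrank, given the size before
# and after in the same unit. shrink_percent is a hypothetical helper name.
shrink_percent() {
  LC_ALL=C awk -v before="$1" -v after="$2" \
    'BEGIN { printf "%.2f\n", (before - after) / before * 100 }'
}

# Usage with the rounded sizes from this post (in MB):
#   shrink_percent 180 2.6    # prints 98.56
```

With the rounded sizes quoted above this comes out a hair above the headline figure, which is to be expected when the inputs are rounded.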