Michael Stucki

Hi Michiel!

I'm not sure what didn't work for you. It certainly takes a while to process, but it is tested and works fine.

I would be careful with the queries you mentioned as I see quite a few issues:

  1. You don't clean up in sys_file.
  2. Your query doesn't check the fieldname, which also needs to match before two rows count as duplicates. A quick check on one of my test sites showed that some of the duplicates refer to the "media" field and some to the "image" field. This can happen if a content element was of type "uploads" before but is now an image element. So unless you keep both entries in the reference table, you also need to check the CType ("uploads" uses "media"; "image" and "textpic" use "image").
  3. Don't forget to check the "deleted" field.
  4. "... WHERE n1.uid > n2.uid AND n1.uid <> n2.uid" can be shortened: n1.uid > n2.uid already implies n1.uid <> n2.uid.
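To illustrate the points above, here is a hedged sketch of a duplicate-finding self-join that also checks fieldname and deleted, and uses only the shortened uid condition. It runs against a toy SQLite database; the column names follow TYPO3's sys_file_reference schema, but the demo data and exact query shape are my own assumptions, not the original queries from the tip.

```shell
# Sketch: find duplicate sys_file_reference rows in a toy sqlite database.
# Columns mimic TYPO3's sys_file_reference; the sample data is invented.
db=$(mktemp)
sqlite3 "$db" <<'SQL'
CREATE TABLE sys_file_reference (
  uid INTEGER PRIMARY KEY,
  uid_local INTEGER,      -- points to sys_file.uid
  uid_foreign INTEGER,    -- points to e.g. tt_content.uid
  tablenames TEXT,
  fieldname TEXT,
  deleted INTEGER DEFAULT 0
);
INSERT INTO sys_file_reference VALUES
  (1, 10, 100, 'tt_content', 'image', 0),  -- original row
  (2, 10, 100, 'tt_content', 'image', 0),  -- true duplicate of uid 1
  (3, 10, 100, 'tt_content', 'media', 0);  -- same file, different field: keep
SQL
# n1.uid > n2.uid already implies n1.uid <> n2.uid, so one condition suffices.
dups=$(sqlite3 "$db" "SELECT n1.uid FROM sys_file_reference n1
  JOIN sys_file_reference n2
    ON  n1.uid_local   = n2.uid_local
    AND n1.uid_foreign = n2.uid_foreign
    AND n1.tablenames  = n2.tablenames
    AND n1.fieldname   = n2.fieldname
    AND n1.uid > n2.uid
  WHERE n1.deleted = 0 AND n2.deleted = 0;")
echo "$dups"   # only uid 2 is reported as a duplicate
rm -f "$db"
```

Because the join also compares fieldname, the "media" row (uid 3) survives even though it points at the same file and record.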

However, I see that the original queries I suggested could be optimized further. Instead of creating temporary tables, a JOIN should work well and be much faster. In any case, you need to make a connection between sys_file and sys_file_reference if you're cleaning up both tables.
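The sys_file side of that connection can be sketched the same way: a LEFT JOIN finds sys_file rows that no sys_file_reference points at any more. Again, this is a hypothetical illustration on a toy SQLite schema, not the original query.

```shell
# Sketch: sys_file rows with no remaining (non-deleted) references.
# Toy schema and data; real TYPO3 tables have many more columns.
db=$(mktemp)
sqlite3 "$db" <<'SQL'
CREATE TABLE sys_file (uid INTEGER PRIMARY KEY, identifier TEXT);
CREATE TABLE sys_file_reference (uid INTEGER PRIMARY KEY,
  uid_local INTEGER, deleted INTEGER DEFAULT 0);
INSERT INTO sys_file VALUES (1, '/user_upload/a.jpg'), (2, '/user_upload/b.jpg');
INSERT INTO sys_file_reference VALUES (10, 1, 0);   -- only file 1 is referenced
SQL
orphans=$(sqlite3 "$db" "SELECT f.uid FROM sys_file f
  LEFT JOIN sys_file_reference r
    ON r.uid_local = f.uid AND r.deleted = 0
  WHERE r.uid IS NULL;")
echo "$orphans"   # file 2 has no reference left
rm -f "$db"
```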

Another update: Don't read files from a subshell. Instead, use "find ... -exec" (this solves issues with filenames that contain whitespace).

Old version:

git filter-branch --tree-filter 'mkdir html && mv $(find . -mindepth 1 -maxdepth 1 | grep -v "^./.git" | grep -v "^./html$") html/' HEAD
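The comment doesn't show the updated command itself, so here is a hedged sketch of the "find ... -exec" idea on a plain toy directory (standing in for the `--tree-filter` working tree): find hands each path to mv as a separate argument, so names with whitespace survive, unlike with `$(...)` word-splitting. The `! -name '.git*'` test mirrors the grep without the end-of-line marker, excluding both ".git" and ".gitignore".

```shell
# Sketch only: a toy directory instead of a real git filter-branch run.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir .git html
touch .gitignore "file with spaces.txt" normal.txt
# Move everything except .git*, and the html/ target itself, into html/.
find . -mindepth 1 -maxdepth 1 ! -name '.git*' ! -name html \
  -exec mv {} html/ \;
moved=$(ls html)
kept=$(ls -A "$workdir" | grep -c '^\.git')
echo "$moved"   # both files end up in html/, spaces intact
```

Inside a real `--tree-filter`, the same find invocation would replace the `mv $(find ...)` subshell shown above.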

Small update: I removed the end-of-line marker on one of the greps, to make sure that ".git" as well as ".gitignore" are not moved.
