Channel: stardot.org.uk

off-topic • Re: Grouping and Counting & Sizing Files In Linux

Back on this as a new problem has emerged. One of last month's picture sets, unusually, has an apostrophe in its name, and the script can't cope with sizing its files. Counting is OK though:

Code:

BM Picture Set Counter Version 1.4 1/10/2023
Counting sets in /media/BMNFS/BeebMaster/Website/Images/2024/May 2024
May 2024,A4HTMLGEN,,8,90133775
May 2024,BeebRoomMay2024,,2,16488161
May 2024,BeebStockRoomMay2024,,11,98369849
May 2024,BeebVaultMay2024,,6,50082286
May 2024,DFS090NetBugs,,29,3055441
May 2024,DFS245OSWORDBugs,,16,1451086
May 2024,DFSForm4.6,,22,3297712
May 2024,DFSZeroLenBug,,55,6874571
May 2024,DiscSplit,,41,8087086
May 2024,DNFSANFS,,22,2508832
May 2024,EconetParty2024,,49,33768334
May 2024,FindFS,,26,3718065
May 2024,FindROM,,7,1141080
May 2024,IsNet,,12,2362995
May 2024,OSBYTE170,,15,2950550
May 2024,PiEconetBridge2.1pre,,8,1536405
May 2024,ReadRTCTests,,4,707870
May 2024,RGBtoHDMIRetune,,4,1650474
May 2024,ROMInfo,,19,3439143
May 2024,ROMStatus,,19,2787191
May 2024,SRAMInit,,17,3168244
May 2024,Wizard'sRevenge,,32,
More generally I will have to have a think about whether it's desirable to have an apostrophe in filenames at all, as eventually they will become part of a URL. The script also copes all right with filenames containing a full stop that isn't the delimiter for the suffix (e.g. DFSForm4.6, PiEconetBridge2.1pre). Again that's unusual, as I don't usually include a full stop in filenames when referencing a version number, so that's perhaps another one to think about.
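
On the URL side, an apostrophe can always be written as %27 once the filename becomes part of a link. A minimal sketch of that percent-encoding in bash follows; `urlencode` is a hypothetical helper name, it is not part of the script, and it has only been considered against plain ASCII names like the ones in the listing above.

```bash
#!/bin/bash
# Hypothetical helper: percent-encode a filename for use in a URL path.
# Letters, digits and - . _ ~ pass through unchanged; anything else
# (apostrophe, space, ...) becomes %XX.
urlencode() {
    local LC_ALL=C          # work byte by byte rather than by character
    local s=$1 out= c i
    for (( i = 0; i < ${#s}; i++ )); do
        c=${s:i:1}
        case $c in
            [A-Za-z0-9._~-]) out+=$c ;;
            # "'$c" makes printf use the numeric code of the character
            *) printf -v c '%%%02X' "'$c"; out+=$c ;;
        esac
    done
    printf '%s\n' "$out"
}

urlencode "Wizard'sRevenge1.jpg"
```

A full stop is left alone by this, so version numbers such as 4.6 in a filename would not need any encoding.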

In the meantime, I would like this script to be able to cope with filenames with special characters in them; I am sure it must be possible.

The current script is below. I've tried several different things on the line near the end where it does the xargs, stat and awk, involving slashes and single and double quotes, but none has solved the problem:

Code:

#!/bin/bash
# Make CSV style list of picture sets & counts & sizes in current directory
# Version 1.4  1/x/2023
# Do not count mp4 files as separate sets, include in main set count (n.b. this increases the set number count, as video files have the same index number as the corresponding picture)
# N.B. sets which have a name which is an exact subset of another set name (eg. "Station112Fix", "Station112FixPBC") will have an incorrect size for the first match
exec 2>>/tmp/debug.output
set -x
# Make list of files only forming part of picture sets
# ie. exclude all non-pic files, and caption & info picture files etc
# Loop through 3 relevant filetypes and exclude matches with unwanted filename content
for i in {jpg,png,mp4}; do
    find -maxdepth 1 \( -iname \*.$i ! -iname \*[A-z\-]0.$i ! -iname \*'XX*XX'.$i ! -iname \*[b-f].$i \) -printf '%f\n' >> FileList
done
# FileList now contains all the files in the dir to be counted
# Next cut off the filename extensions
while IFS= read -r i; do echo "${i%.*}" >> FileList1
done < FileList
# FileList1 is now all the picture files in the directory with no extensions
# Next make list of set names
#  Cut off picture number onwards within file name
sed 's/[0-9]\+[^0-9]*$//' FileList1 > FileList2
# Now add mp4 files as sed will cut off part of the extension if done earlier
#find -maxdepth 1 \-iname \*.mp4 -printf '%f\n' >> FileList2
#find -maxdepth 1 \-iname \*.mp4 -printf '%f\n' >> FileList
# Finally cut off any trailing - to give the list of all files with the setname extract from the filename
## 1.2 - omitted - trailing dashes cut off later, therefore copy FileList2 to SetNames unaltered
cat FileList2 > SetNames
#sed 's/[\-]$//' FileList2 > SetNames
# Delete all duplicates
sort -u SetNames > SetNames2
# Setname portion of all picture files is now in SetNames2
# Echo version etc
echo BM Picture Set Counter Version 1.4 1/10/2023
echo -n Counting sets in' '
pwd
# Loop through each entry in SetNames2 and use grep to match all files beginning with each of the setnames, and do a count and size on the matches
# Print the result on one line per set name using CSV format with directory name (ie. the month&year), set name, empty column (for category manual input later), count, file size
# Start loop, get first line of SetNames2 into $i
while read i ;do
    # Print directory name
    echo -n ${PWD##*/}',' ;
    # Print set name less any trailing -
    echo -n $i | sed 's/[\-]$//' ;
    echo -n ',,' ;
    # Count files in set and store in $count
    count=$(grep ^$i$ SetNames | wc -l) ;
    #Print count, and size using stat & awk
    echo -n $count','; grep ^$i FileList | xargs -I '{}' sh -c "echo -n '{} ' | stat -c '%s' '{}'" | awk '{total += $0} END {print total}' ;
done < SetNames2
# Tidy up
rm FileList FileList1 FileList2 SetNames SetNames2
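
The empty size for Wizard'sRevenge most likely has two causes on that line. Without -0 or -d, xargs treats quotes in its input as quoting characters, so a lone apostrophe is an error (and stderr is going to /tmp/debug.output, so nothing shows on screen). The `sh -c` string also pastes each filename between single quotes, which an apostrophe in the name closes early. A minimal sketch of one way round both, assuming GNU xargs and GNU stat (the script already relies on GNU find for -printf): hand the names to stat directly, one per line, with no intermediate shell. `set_size` is a hypothetical helper name, not something in the script above.

```bash
#!/bin/bash
# Sketch only: set_size is a hypothetical helper.
# Usage: set_size SETNAME LISTFILE
# Prints the total size in bytes of every file named in LISTFILE whose
# name starts with SETNAME. Run it in the directory holding the files,
# since LISTFILE contains bare filenames.
# Assumes GNU xargs (-d, -r) and GNU stat (-c).
set_size() {
    # Literal prefix match: the set name is compared as plain text, so
    # an apostrophe, full stop or bracket in it is not a regex character.
    prefix=$1 awk 'index($0, ENVIRON["prefix"]) == 1' "$2" |
        # -d '\n' : one filename per line, quotes and spaces taken literally
        # -r      : do not run stat at all when nothing matched
        # --      : a filename starting with - is not read as an option
        xargs -r -d '\n' stat -c '%s' -- |
        # Sum the sizes; "+ 0" prints 0 rather than an empty field
        awk '{ total += $1 } END { print total + 0 }'
}
```

In the loop, the grep, xargs and awk pipeline after `echo -n $count',';` would then become `set_size "$i" FileList`. Like the original, this matches on the start of the filename, so the N.B. about one set name being an exact subset of another still applies.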

Statistics: Posted by BeebMaster — Sat Jun 01, 2024 10:10 pm


