How to identify unpaired files in a list

Hello there. I have a problem where I am trying to identify unpaired
files in a directory.

The directory may have files like the following:

abcd0001.txt
abcd0001.def.bak
abcd0002.txt
abcd0003.txt
abcd0003.ghi.bak
abcd0004.xyz.bak

What I’d like to do is identify the unpaired files ‘abcd0002.txt’ and
‘abcd0004.xyz.bak’ in the list above.

I’ve tried reading all the filenames into a single array, but I’m not
sure how to delete items that share a root name. Deleting exact
duplicates is trivial, of course, but the extension on the second file
of each pair varies.

I also tried creating two separate arrays (one for each file type) so
that I can compare the root filenames, but then I’m left with looping
through one array many times, which will take a big performance hit
once the filenames number in the thousands.

Does anyone have any good suggestions that they can offer?

Please let me know. Thanks. Paul.

Perhaps nested hashes.
=r

Oops, I meant a hash of arrays, i.e. a final data structure like:

{'abcd0001' => ['abcd0001.txt', 'abcd0001.def.bak'], 'abcd0002' =>
['abcd0002.txt']}

Then iterate through it, looking for arrays of length 1 only.
GL.
=r
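
For anyone landing here later, the hash-of-arrays idea above can be sketched roughly as follows. This is only a sketch under one assumption that the thread never states outright: that the "root name" is everything before the first dot. The helper name root_name and the hard-coded list are made up for illustration.

```ruby
# Sample data copied from the listing in the question.
# For a real directory, something like Dir.children(path) could supply the names.
filenames = %w[
  abcd0001.txt
  abcd0001.def.bak
  abcd0002.txt
  abcd0003.txt
  abcd0003.ghi.bak
  abcd0004.xyz.bak
]

# Assumption: the root name is everything before the first dot.
# (Hypothetical helper, not from the original posts.)
def root_name(filename)
  filename.split('.', 2).first
end

# Build the hash of arrays in a single pass: root name => files sharing it.
by_root = Hash.new { |hash, key| hash[key] = [] }
filenames.each { |name| by_root[root_name(name)] << name }

# A root with exactly one file has no partner.
unpaired = by_root.values.select { |files| files.length == 1 }.flatten

puts unpaired
```

Each filename is visited once, so this avoids the repeated looping over a second array that the question was worried about.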

On Jul 18, 5:39 pm, Roger P. wrote:

Oops I meant hashes of arrays [...] then iterate through looking for
arrays with length 1 only.

That’s cool. I’ll give it a try. Thanks.

I had the same requirement and ended up using Roger’s method, except
with an array of arrays:
[['abcd0001.txt', 'abcd0001.def.bak'], ['abcd0002.txt'], etc.]
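
One possible way to get that array of arrays (a sketch, not necessarily how it was done here) is Enumerable#group_by followed by values. As before, it assumes that files pair up when they share everything before the first dot, and the sample list is just the one from the question.

```ruby
# Sample data copied from the listing in the question.
filenames = %w[
  abcd0001.txt
  abcd0001.def.bak
  abcd0002.txt
  abcd0003.txt
  abcd0003.ghi.bak
  abcd0004.xyz.bak
]

# Assumption: files belong together when the text before the first dot matches.
# group_by builds a hash keyed by that root; values drops the keys,
# leaving an array of arrays.
groups = filenames.group_by { |name| name.split('.', 2).first }.values

# Keep the groups that hold a single file, then unwrap each one.
unpaired = groups.select { |group| group.size == 1 }.map(&:first)

p groups
p unpaired
```

The root names are thrown away once the groups exist, which is fine when only the unpaired filenames matter.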