Efficient implementation of saving search state

Hi everyone, this is my first post to this forum, and I am a new Ruby
hobbyist. I have a newb question that I am hoping you more experienced
folks can help me out with.

I am currently developing some Ruby code that uses the Find.find method
to perform a depth-first (by design) search of a directory tree. No big
issues with performing that; however, I want to build in the ability to
save application state if the app is forced to close or hits an error.
Specifically, when searching an extremely large directory structure, I
would like to resume at the location within the directory tree that was
last reached prior to application shutdown. Also, the directory
structure is projected to contain millions of files, so I need to take
that into consideration as well.

Initially, I was looking into Marshal and PStore, but I am leaning
towards implementing a database to index the search paths. I am asking
for pointers to tutorials, design patterns, and anything else that may
be relevant.

Any and all comments are welcome. Thanks in advance for everyone’s help.

–Chris

2010/2/28 User B.:

Welcome! I am not sure what you think is wrong with using Marshal.
Wouldn’t it be sufficient to store a stack of currently processed
directories to a file and overwrite that file every time you have
finished reading one directory? I believe the volume should be
sufficiently small to be handled with Marshal.
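
One way to read this suggestion is sketched below. It is only an
illustration under stated assumptions, not code from this thread: the
Checkpoint class, the resumable_walk method and the checkpoint file
name are all hypothetical, and the stack here holds the directories
still waiting to be visited. The checkpoint is rewritten each time a
directory has been fully read, so after a crash the directory that was
in progress is simply read again.

```ruby
# Hypothetical helper: persists an array of directory paths with Marshal.
class Checkpoint
  def initialize(path)
    @path = path
  end

  # Write to a temporary file and rename it over the old checkpoint, so
  # an interrupted write does not leave a truncated file behind.
  def save(stack)
    tmp = "#{@path}.tmp"
    File.open(tmp, "wb") { |f| f.write(Marshal.dump(stack)) }
    File.rename(tmp, @path)
  end

  # Returns the saved stack, or nil when no checkpoint exists yet.
  def load
    return nil unless File.exist?(@path)
    File.open(@path, "rb") { |f| Marshal.load(f.read) }
  end

  # Remove the checkpoint once the whole tree has been processed.
  def clear
    File.delete(@path) if File.exist?(@path)
  end
end

# Depth-first walk that yields every non-directory entry and saves the
# stack of pending directories after each directory is finished.
def resumable_walk(root, checkpoint)
  pending = checkpoint.load || [root]
  until pending.empty?
    dir = pending.last
    entries = Dir.children(dir).sort.map { |name| File.join(dir, name) }
    # Symlinked directories are not descended into, to avoid cycles.
    subdirs = entries.select { |p| File.directory?(p) && !File.symlink?(p) }
    (entries - subdirs).each { |file| yield file }
    pending.pop
    # Push in reverse so the first subdirectory is visited next.
    pending.concat(subdirs.reverse)
    checkpoint.save(pending)
  end
  checkpoint.clear
end
```

The stack only ever holds the unvisited siblings along the current
path, so it stays small even when the tree contains millions of files.
The checkpoint file should live outside the tree being searched.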

Kind regards

robert

Robert, thanks for your response. Your comment makes perfect sense. I am
saving the string of the last directory searched, then Marshaling to a
file. Everything appears to be working as it should. Thanks again.
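
For readers following along, the approach described here might look
roughly like the sketch below. It is a guess at the shape of the code,
not the code itself: the method name resume_find and the state file
argument are hypothetical. It assumes the tree does not change between
runs and relies on Find.find visiting entries in the same order each
time.

```ruby
require "find"

# Walks root with Find.find, recording each directory as it is entered.
# On a later run, everything up to and including the recorded directory
# is skipped, so that directory's contents are processed again.
def resume_find(root, state_file)
  last = File.exist?(state_file) ? Marshal.load(File.binread(state_file)) : nil
  skipping = !last.nil?

  Find.find(root) do |path|
    if skipping
      # Stop skipping once the saved directory itself has been reached.
      skipping = false if path == last
      next
    end

    if File.directory?(path)
      File.binwrite(state_file, Marshal.dump(path))
    else
      yield path
    end
  end
end
```

Note that the skipped part of the tree is still traversed on restart,
just not processed, and that nothing is processed at all if the saved
directory has been removed in the meantime.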

2010/3/4 User B.:

Just for the fun of it, here’s my solution:

Note: I assumed from your posting that you want to process folders
after the files and subfolders they contain, the same way find’s
“-depth” option does.
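
The code that went with this post does not appear above, so the
following is not Robert's solution. It is only a minimal sketch of the
ordering being described, with a hypothetical method name
(each_depth_first): every directory is yielded after all of its files
and subdirectories.

```ruby
# Post-order walk: a directory is yielded only after everything inside
# it has been yielded, similar in spirit to find's -depth option.
def each_depth_first(dir, &block)
  Dir.children(dir).sort.each do |name|
    path = File.join(dir, name)
    if File.directory?(path) && !File.symlink?(path)
      each_depth_first(path, &block)
    else
      block.call(path)
    end
  end
  block.call(dir)
end
```

Because this version recurses, saving state would mean recording the
path from the root down to the current directory rather than a single
string.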

Kind regards

robert