Theory behind proof of concept for automatic IVTC decimation

zbrod · February 14, 2010, 9:55pm

Hi again,

This relates to my video editing project. Now that I have a list of
patterns I can (for sake of ease) hard code into my program I have
been googling about file operations. But I have not been able to
find an answer to my question…

This post originally started as a question on file operations, but
seems to have evolved into an outline for my proof of concept program
and what I’d like it to do / how it should work…

Again not asking anyone to code the whole thing for me… But for
questions mentioned in the post, or where it seems I don’t understand
how something works, pointing me in the right direction with links
to tutorials and such, or general advice on better ways to do
something, or if you are inclined to give practical examples on how
to achieve my goals, it would be greatly appreciated.

I am interested in how you would do several different things, such
as:

Start reading a file at a specific line, that is not guaranteed to be
the same line every time… I.e while the [MATCHES] section in a
Yatta Project file might start on line 300 in one case, in another
where the user has done different tasks (more, or less) there may be
more or less information stored, and the [MATCHES] section might not
be on line 300 every single time…

[MATCHES] is the header that precedes the list of matches for the
frames Yatta is working on. So I assume I need to search the file
and stop at the first occurrence of [MATCHES].

Then start reading line by line, one at a time. Each frame is
listed on one line, with one character for the pattern… So a
pattern of CCCNN would be presented as 5 lines in the text file like
so:

C
C
C
N
N

I am supposing I should iterate through each line, and store the
results in an array? I think this is best, because array indices
also start with 0, just like video frame counts. Frame 0 would
amount to Array[0]. Easy to keep track of… If I understand
things right, I don’t need to worry about getting the frame count
beforehand and creating a pre-defined matrix with “X number of
spots” to fill… I can just expand the array on the fly and the
appropriate index number will be given?

(Array.push?)

So the question is, how do I open the Yatta project file (just a text
file AFAIK), then search until [MATCHES] is found, then start
counting on the actual line the pattern data starts on? There
are no blank lines between [MATCHES] and the first frame. However,
after the last frame in the video there IS a blank line, and then
the next section header [POSTPROCESS] occurs.

So I guess I would want it to iterate through all lines AFTER the
line [MATCHES] is found on, and then stop iterating and pushing
data into the array at the first blank line it comes across (which
is after the last frame in the video).

An abbreviated structure of the section would look like this

[MATCHES]
c
c
c
n
n
c
c
c
n
n
<-- note the white space / blank line
[POSTPROCESS]

The hope is I end up with an array, containing the following (using
the above section example):

MyArray [c, c, c, n, n, c, c, c, n, n] which in order of index
would be… 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 - and would exactly
match the frame numbering… MyArray[0] == Frame 0 of my
video file… Sound correct so far?
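
A minimal sketch of the approach described above, assuming the project file really is plain text with one match character per line and a blank line closing the section. The sample text and the method name read_matches are made up for illustration; a real run would pass File.foreach(path) instead of the sample lines.

```ruby
# Hypothetical sample standing in for a Yatta project file.
sample = "[SOMETHING]\nfoo=1\n\n[MATCHES]\nc\nc\nc\nn\nn\nc\nc\nc\nn\nn\n\n[POSTPROCESS]\n"

# Collect the lines between the [MATCHES] header and the next blank line.
# `lines` is anything that yields lines, e.g. File.foreach(path).
def read_matches(lines)
  lines.map(&:chomp)
       .drop_while { |line| line != "[MATCHES]" } # skip up to the header
       .drop(1)                                   # skip the header itself
       .take_while { |line| !line.strip.empty? }  # stop at the blank line
end

matches = read_matches(sample.each_line)
matches.each_with_index { |m, frame| puts "frame #{frame}: #{m}" }
```

With ten match lines the indices run from 0 to 9, so matches[0] lines up with frame 0.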

Meanwhile… storing the patterns to match against… Maybe I am
overcomplicating it, but I thought about making object instances
(Pattern 1, Pattern 2, etc) that were arrays themselves and then
using a loop to find the pattern the section matches…

Of course how would you tell Ruby to compare only 5 individual chunks
of the array (0-4, 5-9, 10-14, etc) incrementing by 5 each time so it
doesn’t lose its place, backtrack, etc and counts up properly? I
know how to make an iteration loop to loop through all possible
patterns until a match is found - but not how to specify sections of
an array, much less doing so in a manner that I don’t mess up and
compared 0-4, and then 1-5, or skip/miss things in other ways.

Would it be simpler to simply read in the patterns 5 lines at a
time, keeping each 5 line “chunk” of data inside an array as
mentioned before, but using the object method to reference it, and
store all objects generated inside a master array ?

Like:

MasterList: an array holding a list of all items generated from
reading the file.

Section1: a 5 frame pattern from the file comprised of frames 0 - 4
Section2: same as above but frames 5 - 9
ad nauseam

I suspect I would use object references throughout as well, and not
necessarily rely on array indices, to keep things simple since by
breaking it up into multiple arrays you could lose track of the real
frame numbers…

Well I’m having too many ideas now… So I should probably cut this
short. Sorry to be so long winded, but hopefully everyone gets an
idea of what I am aiming to do… More than anything I just need an
answer on the file manipulation problems, once I can get the data
into memory, the rest should be pretty simple for me to play around
with and find the best method to use.

-Zach

zbrod · February 14, 2010, 10:06pm

Sorry, more typos again… in the numbering when talking about
frames / array indices. Please note I am aware of the errors…
thanks

I guess I should start using the spellcheck button in Agent…

zbrod · February 14, 2010, 11:34pm

On 2/14/2010 3:55 PM, Zach B. wrote:

data into the array at the first blank line it comes across (which
is after the last frame in the video).

http://pleac.sourceforge.net/pleac_ruby/fileaccess.html
change this example

# The method IO#readline is similar to IO#gets
# but throws an exception when it reaches EOF
f = File.new("bla.txt")
begin
  while (line = f.readline)
    line.chomp
    $stdout.print line if line =~ /blue/
  end
rescue EOFError
  f.close
end

to
…
$stdout.print line if line =~ /\[MATCHES\]/
…

add an inner loop after that to process all the lines til you hit your
blank line
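
One way the suggested inner loop might look, sketched with IO#gets (which returns nil at EOF instead of raising) so no rescue is needed. The file name and its contents are hypothetical and are written to a temporary directory only so the example runs on its own.

```ruby
require "tmpdir"

matches = []
Dir.mktmpdir do |dir|
  path = File.join(dir, "example.yap") # hypothetical file name
  File.write(path, "[OTHER]\nx\n\n[MATCHES]\nc\nn\n\n[POSTPROCESS]\n")

  File.open(path) do |f|
    # Outer loop: skip lines until the header is found (or EOF).
    while (line = f.gets)
      break if line.chomp == "[MATCHES]"
    end

    # Inner loop: collect lines until a blank line or EOF.
    while (line = f.gets)
      line = line.chomp
      break if line.strip.empty?
      matches << line
    end
  end
end

p matches
```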

zbrod · February 15, 2010, 5:19am

On Sunday 14 February 2010 02:55:07 pm Zach B. wrote:

Start reading a file at a specific line, that is not guaranteed to be
the same line every time… I.e while the [MATCHES] section in a
Yatta Project file might start on line 300 in one case, in another
where the user has done different tasks (more, or less) there may be
more or less information stored, and the [MATCHES] section might not
be on line 300 every single time…

In other words, you want to start reading once you get to [MATCHES].

[MATCHES] is the header that precedes the list of matches for the
frames Yatta is working on. So I assume I need to search the file
and stop at the first occurrence of [MATCHES].

Yes.

There shouldn’t be anything hard about this. Read each line of the file
until you find one that matches [MATCHES]. You might look at the
each_line method.

I am supposing I should iterate through each line, and store the
results in an array?

Probably. It depends very much on what you want to do with them.

If I understand
things right, I don’t need to worry about getting the frame count
beforehand and creating a pre-defined matrix with “X number of
spots” to fill… I can just expand the array on the fly and the
appropriate index number will be given?

(Array.push?)

Yes.

So the question is, how do I open the Yatta project file (just a text
file AFAIK), then search until [MATCHES] is found, then start
counting on the actual line the pattern data starts on? There
are no blank lines between [MATCHES] and the first frame. However,
after the last frame in the video there IS a blank line, and then
the next section header [POSTPROCESS] occurs.

So let me guess – each header line will be [SOMETHING], right?

So you could just do this:

open 'filename' do |file|
  lines = file.each_line
  line = lines.next.chomp
  until line == '[MATCHES]'
    line = lines.next.chomp
  end

  # Now you're at the MATCHES line.

  array = []
  until line =~ /^\s*$/
    line = lines.next.chomp
    array.push line
  end
end

I’m sorry, it was way easier to just do that than to try to explain how
to do it.

Meanwhile… storing the patterns to match against… Maybe I am
overcomplicating it, but I thought about making object instances
(Pattern 1, Pattern 2, etc) that were arrays themselves and then
using a loop to find the pattern the section matches…

I have no idea why you would want to do that. But at least get the above
working before you overcomplicate it.

What I would do is use a regex to look for anything that looks like
[FOO], and store their contents (as arrays) in a hash. Then you would
be able to pull out the MATCHES section by doing some_hash['MATCHES'].
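
A rough sketch of that hash idea, assuming every header is a line of the form [NAME] and that blank lines only separate sections. The sample text and the INFO section name are invented for the example.

```ruby
# Hypothetical sample; only MATCHES and POSTPROCESS come from the thread.
text = "[INFO]\nfoo\n\n[MATCHES]\nc\nc\nn\n\n[POSTPROCESS]\n"

sections = {}
current = nil
text.each_line do |raw|
  line = raw.chomp
  if (m = line.match(/\A\[(.+)\]\z/))
    current = sections[m[1]] = []   # header line: start a new section
  elsif current && !line.strip.empty?
    current << line                 # data line: add it to the open section
  end
end

p sections["MATCHES"]
```

Blank lines are simply skipped here rather than treated as terminators, since the next header already closes the previous section.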

Even better: this file looks suspiciously like an INI file. If it is,
find a gem to do it for you.

Of course how would you tell Ruby to compare only 5 individual chunks
of the array (0-4, 5-9, 10-14, etc) incrementing by 5 each time so it
doesn’t lose its place, backtrack, etc and counts up properly?

Now you’ve lost me. What’s special about 5?

I
know how to make an iteration loop to loop through all possible
patterns until a match is found - but not how to specify sections of
an array, much less doing so in a manner that I don’t mess up and
compared 0-4, and then 1-5, or skip/miss things in other ways.

Look at Array#slice. Better yet, go to ruby-doc.org and read up on the
Array documentation.
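
For instance, Array#slice (or the equivalent [] form) takes a start index and a length, which is enough to pick out 0-4, then 5-9, without overlap. The frame data below is made up.

```ruby
frames = %w[c c c n n c c c n n]

first  = frames.slice(0, 5)  # start at index 0, take 5 elements: frames 0..4
second = frames[5, 5]        # same operation written with []: frames 5..9
third  = frames[10, 5]       # start equal to the array length: empty array

puts first.join
puts second.join
p third
```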

MasterList: an array holding a list of all items generated from
reading the file.

Section1: a 5 frame pattern from the file comprised of frames 0 - 4
Section2: same as above but frames 5 - 9
ad nauseam

Or you could stick them in a giant array and do something like
each_slice, which is in the Enumerable module, which is included in
Array.
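
A small sketch of each_slice walking the array in non-overlapping groups of five. The frame data and the choice of "ccccc" as the pattern to look for are assumptions made for the example.

```ruby
frames = %w[c c c c c c c c n n c c c c c]

# each_slice(5) yields elements 0-4, then 5-9, then 10-14, never 1-5.
report = frames.each_slice(5).with_index.map do |chunk, i|
  first = i * 5
  last  = first + chunk.size - 1
  label = chunk.join == "ccccc" ? "match" : "no match"
  "#{first}-#{last}: #{chunk.join} (#{label})"
end

puts report
```

If the array length is not a multiple of five, the final chunk is simply shorter, so it will not match a five-character pattern.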

I suspect I would use object references throughout as well, and not
necessarily rely on array indices, to keep things simple since by
breaking it up into multiple arrays you could lose track of the real
frame numbers…

Erm… do you need the real frame numbers? That’s easy enough to add.
Something like, say you have a variable called ‘array’ that’s a raw
list of frames, and you want a list of 5-element frames which each
contain the frame number… Try this:

array = array.each_with_index.each_slice(5).to_a
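
To make the shape of that result concrete, here is a short sketch with made-up frame data. Each chunk holds five [match, frame_number] pairs, so the real frame numbers travel with the data.

```ruby
array = %w[c c c n n c c c n n]
chunks = array.each_with_index.each_slice(5).to_a

summary = chunks.map do |chunk|
  pattern     = chunk.map(&:first).join  # the five match characters
  first_frame = chunk.first.last         # frame number of the first pair
  last_frame  = chunk.last.last          # frame number of the last pair
  "#{first_frame}-#{last_frame}: #{pattern}"
end

p chunks[1]
puts summary
```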

zbrod · February 17, 2010, 7:30pm

I think I’ve made some great progress so far… At least, I’ve got
some test code running without errors.

I had to modify David’s code somewhat, however I got it to work…
(it wouldn’t work until I changed the array to a $global - no idea why
unless I was supposed to encompass the code in an anonymous method to
be called with args or something like that?) I also changed the file
open statement to refer to a variable containing the full path to the
file. But hey, it’s working…
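
A likely explanation for needing the global, offered as a guess since the intermediate code is not shown: a local variable first assigned inside a block stops existing when the block ends. Declaring it before the block makes it visible both inside and after, with no global needed. The variable names here are illustrative.

```ruby
[1].each do
  inside = [:a]   # first assigned inside the block: local to the block
end
# After the block, `inside` is no longer a known local variable.
visible = defined?(inside) ? true : false

outside = []      # declared before the block
[1].each do
  outside.push :a # the block can see and modify it
end

p visible
p outside
```

The same applies to the block passed to open: creating the array on the line before open keeps it usable afterwards.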

Anyhow. It’ll read the lines into the array how I want, and then I
just divide the array length by 5, and use that number for a times
loop, then it just loops through a process that converts a,b,c,d,e
(array indices) to a joined string, compares that against a variable
holding a pattern of the same length, and if successful puts the
frame range into a new array element, counts up the frame ranges by
5 each, and moves on to the next set… (or if the match fails it
counts up the variables and moves on without making an entry in the
new array).

So… I can’t quite figure out what to do to get them back into a text
file, with each element of the new array being output to a single
line (next element on next line, etc).

Ultimately I want it to go back to the Yatta project file, find the
[NODECIMATE] ranges section, and input the array elements there,
line by line… But just learning how to get it into a blank file,
line by line will suffice for now… I can copy/paste the ranges into
the Yatta project to test the results easily enough.

I haven’t found any helpful examples on the web, or in documentation
though. Maybe I’m not looking in the right places, but the best I
found was an example of using to_yaml to store an array in a YAML
file… which doesn’t really help me.

Any simple examples would be greatly appreciated.
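
A minimal sketch of writing an array to a text file, one element per line. File#puts given an array writes each element on its own line. The output file name is hypothetical, and a temporary directory is used only so the example cleans up after itself.

```ruby
require "tmpdir"

nodecimate = ["0^4^0", "5^9^0", "10^14^0"]
written = nil

Dir.mktmpdir do |dir|
  out_path = File.join(dir, "ranges.txt") # hypothetical output file

  # Mode "w" creates the file, or truncates it if it already exists.
  File.open(out_path, "w") do |out|
    out.puts nodecimate
  end

  written = File.read(out_path)
end

print written
```

In a real script the path would point wherever the ranges should be saved, and the Dir.mktmpdir wrapper would go away.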

Here is what I’ve accomplished so far. It is most likely horribly
inefficient or something along those lines, but I think in a verbose
step-by-step manner and it’s easier as a newcomer with limited
programming time under my belt, to understand what I was trying to
accomplish when I come back to stuff later.

Note: eventually I’ll write something in to calculate the number of
times to run the loop, but I just did this to see if I was actually
going anywhere. I also shifted from my original goal to something
easier to start with… which is identifying 30fps progressive
frames, which will be marked as not to be decimated. Even with just
those changes, I could let Yatta handle the rest of the IVTC /
decimation, as it’s the 30p sections that screw up most IVTC pattern
locks anyway.

path = "L:/D-Note/Ep 01/Episode 01.d2v.yap"
open path do |file|
  lines = file.each_line
  line = lines.next.chomp
  until line == '[MATCHES]'
    line = lines.next.chomp
  end

  # Now you're at the MATCH line.

  $array = []
  until line =~ /^\s*$/
    line = lines.next.chomp
    $array.push line
  end
end

$nodecimate = []

$a = 0
$b = 1
$c = 2
$d = 3
$e = 4
7203.times do
  match_for = "ccccc"
  pattern = $array.values_at($a,$b,$c,$d,$e).join
  if pattern == match_for
    $nodecimate.push $a.to_s + "^" + $e.to_s + "^0"

    $a += 5
    $b += 5
    $c += 5
    $d += 5
    $e += 5

  else
    $a += 5
    $b += 5
    $c += 5
    $d += 5
    $e += 5
  end
end

A snippet of the results, which is exactly what I’m looking for:

0^4^0
5^9^0
10^14^0
15^19^0
20^24^0
25^29^0
30^34^0
790^794^0
795^799^0
800^804^0
810^814^0
980^984^0

I was thinking it would be nice to eventually write some code that
would also compare the results of several frame range matches… So
instead of back to back entries for frames 0 to 34, it could chop
it down to one entry of 0^34^0
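
A sketch of how that merging might work, assuming the ranges are kept as [first, last] pairs in ascending order before being turned into strings. The sample ranges are taken from the output above; the variable names are illustrative.

```ruby
ranges = [[0, 4], [5, 9], [10, 14], [790, 794], [795, 799], [810, 814]]

merged = ranges.each_with_object([]) do |(first, last), out|
  if out.any? && out.last[1] + 1 == first
    out.last[1] = last       # continues the previous range: extend it
  else
    out << [first, last]     # gap before this range: start a new entry
  end
end

lines = merged.map { |first, last| "#{first}^#{last}^0" }
puts lines
```

Keeping the numbers as integers until the last step avoids having to parse the "a^b^0" strings back apart.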

But it definitely does what I want so far… I noticed it seems to
have one more frame than is counted in the project file (under its
frame count listing) but that may be a quirk I have noticed between
other applications before. I’m pretty sure it’s reading all of them.
Unless it’s inserting an array entry for the white space line it looks
for, to tell it to stop… not sure yet. The frames in the beginning
are definitely accurate though.
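
The extra entry is probably the blank line itself: in the loop above, the next line is read and pushed before the until condition is checked again, so the terminating blank line ends up in the array as an empty string. A sketch of a variant that checks before pushing, using a made-up list of lines in place of the file:

```ruby
# Hypothetical stand-in for file.each_line.
lines = ["[MATCHES]", "c", "n", "", "[POSTPROCESS]"].each

line = lines.next
line = lines.next until line == "[MATCHES]"

array = []
loop do                      # loop also ends quietly if the lines run out
  line = lines.next.chomp
  break if line =~ /^\s*$/   # test for the blank line BEFORE pushing
  array.push line
end

p array
```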

Couldn’t have gotten this far without all the previous help though, so
thanks again.

-Zach

zbrod · February 15, 2010, 8:31pm

Thanks for those two replies. That should set me on the right track.

David:
The reason 5 is an important number for me, is because I am
performing an Inverse Telecine operation on DVD content. Telecining
is a process by which 24fps film material is made suitable for NTSC
broadcast television (30fps, or 60 fields per second). However I
am dealing with a lot of Anime, which by nature of the industry, is
actually Hybrid content. It contains both 24fps telecined content,
and genuine 30fps content.

How can you possibly display both at the same time within the same
stream? Well due to the way TV works, by interlacing two fields of
data, you can telecine the 24 fps content to a speed of 30fps
content by field blending.

There is a lot of math involved that I don’t understand, but basically
it involves blending the fields of two frames out of every set of 5.
There is a top field and a bottom field to every image you see on TV.
By blending these you can trick the viewer into seeing 24fps material
as it was intended to be seen, even though it’s working inside a 30fps
stream.

Here is a link explaining it…

What I am doing is reversing the process… However because I am
working with Hybrid video, it is not that simple. I actually have to
take note of any 30FPS progressive sections, and treat them
differently than the 24FPS Telecined sections that I will be
decimating. As for the 24fps sections themselves, I need to
examine 5 frames at a time, to mark a specific frame (and this is
part of the proof of concept for what I am doing) that I believe is
the duplicate which needs to be deleted from the stream.

So I need to look at 5 values in the array at a time, comparing them
against a master list of pattern matches, so I can tell the program
what to do when it finds a match with one of the possible patterns. I
also need to be able to tell it NOT to decimate any identified 30fps
patterns it will come across, and potentially mark them so I can
make an MKV timecodes file with their frame numbers later on…
depends on how much I want to do within Yatta itself.

That’s why I need to work in 5 frame blocks with the arrays. Hope
you understand better what I am aiming to do… The reason I am going
through all this is because no one has developed a tool that can
reliably perform this operation, specifically on Hybrid Anime
content, without producing errors. YATTA is used to perform a
completely manual IVTC operation, with the user reviewing pattern
matches and marking frames for decimation/postprocess
deinterlacing/filtering and what have you.

I want to try to automate the IVTC portion of this work. It is
arguably the most time consuming and error prone part of working with
YATTA. It can take a user between 5 - 10 hours to manually IVTC a
20 minute episode. Closer to 5 hours IF they are very
experienced.