I’m trying to write a script to download my edX course videos for me. This is what I’ve written, but it doesn’t seem to be working. Can someone help me out, please?
require 'rubygems'
require 'open-uri'

File.open("video.mp4", "wb") do |file|
  # open the edX video URL and stream it into the local file
  open(" Authn | edX") do |read_file|
    file.write(read_file.read)
  end
end
When I try to download whatever your URL points to with 'wget', I do not get any sort of MP4 video stream, just an HTML warning page. If you are sure about the URL, this probably means that successful communication with the site requires setting some cookies and/or some other out-of-band data exchange. You need to make sense of the complete conversation that takes place between your browser and the site, and then mimic that conversation.
This is not an easy task, and it requires you to learn a lot of things. Both for the sniffing and for the mimicking, Ruby can be an excellent help, but the only way I know of requires learning the ins and outs of how an HTTP conversation takes place. If you type

ri Net::HTTP

in your terminal, do you understand what you read? If you don't, then maybe the task is too large a step for you.
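Just to give you an idea of the shape of such a conversation, here is a rough sketch with Net::HTTP. The host, paths, and form field names below are invented placeholders, not edX's real endpoints, so treat it only as an illustration of the cookie handshake you would have to reproduce:

require 'net/http'
require 'uri'

# Placeholder endpoint: you would have to sniff the real login URL and fields
login_uri = URI("https://example.com/login")

http = Net::HTTP.new(login_uri.host, login_uri.port)
http.use_ssl = true

# POST the credentials and keep the session cookie the server hands back
login = Net::HTTP::Post.new(login_uri)
login.set_form_data("email" => "you@example.com", "password" => "secret")
cookie = http.request(login)["Set-Cookie"]

# Replay that cookie when asking for the protected file, streaming it to disk
video = Net::HTTP::Get.new("/path/to/video.mp4")
video["Cookie"] = cookie
http.request(video) do |response|
  File.open("video.mp4", "wb") do |file|
    response.read_body { |chunk| file.write(chunk) }
  end
end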
Maybe someone else on the list can propose a ready-made solution…
It seems that in order to access that video, you first have to log in. If you check the file you generated, it's an HTML file containing a login form.
For the kind of web scraping you need, I recommend the mechanize gem, which lets you drive the navigation programmatically: fill out forms, submit them, follow links, and so on. As long as the site doesn't need JavaScript to work, it works fine. I use it a lot to automate tasks on websites. Here's an example that posts to an internal wiki we use at work:
require 'mechanize'

agent = Mechanize.new
agent.user_agent_alias = 'Linux Mozilla'

# get the login page
page = agent.get(URL)

# do the login
form = page.form("login_form")
form.username = configuration["WikiUser"]
form.password = configuration["WikiPassword"]
page = agent.submit(form, form.buttons.first)

# go to B!Wiki
page = page.link_with(:text => "B!Wiki").click

# submit a form to log in to the wiki
page = page.form("userlogin").submit
page = page.link_with(:text => "Welcome").click

# find the page and edit it
calendar_page = configuration["WikiPage"] % year
page = agent.get(calendar_page)
page = page.link_with(:href => /action=edit&section=#{month+1}/).click

# insert the new value for the table
form = page.form("editform")
form.wpTextbox1 = content

# submit
agent.submit(form, form.buttons.first)
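Once the agent has logged in it keeps the session cookies, so fetching the actual file afterwards is just another get. Assuming you have the direct URL of the video (the one below is a placeholder), something along these lines should save it:

# treat large binary responses as downloads instead of parsed pages
agent.pluggable_parser.default = Mechanize::Download
agent.get("https://example.com/path/to/video.mp4").save("video.mp4")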
If the download requires some sort of login, then he'd have to read the fine print of whatever terms he agreed to when getting that login. Otherwise, as far as I know, nothing binds him to edX. But IANAL.
Carlo