Welcome to my channel. Today we are going to be playing with PHP a little bit and,
to be more specific, we are going to be grabbing some data, scraping some data
off of my own website that I built. So let's say you come across a website
where you want a certain amount of content, and you're just tired of copying
and pasting, and you just want to put it into a text file that you can use,
or kind of consolidate things together.
So, like, here I have these articles that I've written up, and instead of having to
go and click through all of these one by one, I just wanted to grab all of them,
copy them, and paste them. But in order to copy and paste, I also wanted the
formatting, which I can't really get when looking at the article or the
source code right here. So what I had to do was, okay, if I want the article
to look all pretty and stuff like that, I've got to go into the source code,
into the elements, and grab all this stuff, which is fine and dandy.
It did everything that I needed it to, so no problem.
Well, what happens if later on I want some more data, or I want to automate this
so that it goes and captures the data for me? I don't have to run anything; I could
just let my computer run in the background, and anytime someone updated this page
I'd be able to get a notification on my phone and also the message of what it
contained. In order to do that, I have to do a little bit of snooping around.
I happen to know exactly how the site works because I built it. So we're
going to the Network tab, and these basic steps kind of apply to all sites,
whether it's one you built yourself or one you need to data-mine,
whatever you do with your data.
So, enter. Okay, here we go, got some network traffic. Now, what this
network traffic is is all these different things that came up for all the
different requests that the browser made to the internet to say, "I need this data,
give it to me." So the first request made was for the domain name itself, and it
came back with this source code right here, which your browser then interpreted
and displayed on the screen. That's fantastic.
After that, it was up to the JavaScript and HTML to decide what to do. So it went
through and got some markdown stuff, got some index CSS, Cloudflare stuff, Rocket
stuff, it got a config file, and it got the dev get setup. So that right here
matches this information right here. So I go back to the config file, like, oh,
okay, there's dev get setup, which is this right here, and it has a title of
"Get Setup", which I'm guessing is how the articles are populated. I know
exactly that it works like that, but if you didn't, that's how you would know;
you'd have kind of a guess at it.
Okay, so if I go to this address right here in my browser, there's the file.
That's cool, so I can get the raw .md file, the markdown file itself, which is
pretty nifty. So I go and I say, okay, let's go back to the JSON, and we get that
other .md file. Okay, now that we've got this .md file, let's go back to the
JSON config and go get this next .md file. This is all fine and dandy
for like two or three of them, but what if there's like 50 of them?
Well, that's something your computer can do, and can do really, really
quickly. So let's build an application to
do it for us. In order to do that, you can do it in several different languages.
I prefer PHP just because that's what I work in the most. I could do a Java
version of this or a Python version of this. As for writing this in C, oh please,
please don't make me do it in C. Don't ever make a website-scraping application
in C unless you really want the challenge of trying to get all that stuff to work.
A higher-level language works just fine, because the amount of time that it takes
to go out and retrieve the data is hundreds of thousands of times slower than it is
to process the data itself. So I just set up this little
this little
server on my own machine little
server with PHP running there's a link
in the description of how do that
yourself is pretty straightforward
I'm actually there's another link right
here on the internship project
to look at it if you go to the apt EK I
think it's right here for windows and
right there for mac and just follow that
tutorial estimating three hours you have
no idea what you're doing if you have
everything if you knew what you're doing
you've done this before then it'll take
me like 10-15 minutes it took me 10 or
15 minutes to this so that's ok that's
all fine and dandy
Alright, so the server. It shows a .com address and the "hello world" here. If you
were to actually go to that .com address, whoops, the domain name doesn't exist;
it'll bring you to whoever your closest domain name registrar is, saying, "Hey,
we can sell you this." What I did is I set up the hosts file to point that
.com name to the local IP address, which is 127.0.0.1, and/or ::1 for IPv6.
That's cool. What does that mean? By itself, that means almost nothing. For this
to work, I guess I have to tell you where the "hello world" is coming from. It's
coming from this file right here, and this file says echo "hello world".
Okay, so it's printing out "hello world". That's fine and dandy, but I want to
write a few lines of logic in here to allow me to go and access this file
right here.
So I need to go and access this file, then parse through this data, grab each one
of these links, and then grab the data that's in those links. So, number one, to
start: access this file. To access the file, there's a function called
file_get_contents. The filename has to be a string. Okay, that will simply
get the contents. If I was to echo it out right here, it would simply, if you
can guess what it will do, echo out that right there in a very unpretty way.
If you look at the source, it looks like this, but because there are no break
elements, no <br> elements, the lines don't stop right there, so in the browser
it doesn't look like this unless you do something fancy, which I like to do when
just echoing out some raw stuff: wrap a <pre> tag around it, which basically
tells the browser, "Hey, this is code, act like it."
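A minimal sketch of this step, assuming the wrapper tag is `<pre>` (the recording is unclear on the exact tag). The helper name `renderRaw` and the sample JSON are made up for illustration, and a temporary local file stands in for the site's config file.

```php
<?php
// Hypothetical helper: read a file (or URL) with file_get_contents and wrap
// the text in <pre> so the browser keeps the newlines instead of collapsing them.
function renderRaw(string $source): string
{
    $contents = file_get_contents($source);
    if ($contents === false) {
        return '<pre>Could not read ' . htmlspecialchars($source) . '</pre>';
    }
    // Escape the text so any markup inside it is shown rather than interpreted.
    return '<pre>' . htmlspecialchars($contents) . '</pre>';
}

// Demo: a temporary local file stands in for the site's JSON config.
$tmp = tempnam(sys_get_temp_dir(), 'cfg');
file_put_contents($tmp, "[\n  {\"title\": \"Get Setup\"}\n]");
echo renderRaw($tmp), "\n";
unlink($tmp);
```

The same call accepts a URL instead of a local path, provided `allow_url_fopen` is enabled in the PHP configuration.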
So there we go: it uses the newline characters as line breaks. Good, done that.
Good so far. Well, I didn't want a string, and this is a string. How can I get
just, like, this right here? Luckily for us, this is a JSON file, and there's a
json_decode function that allows us to decode the JSON-encoded string that we get
from file_get_contents. If we were to echo that, it wouldn't do anything but
break, because what it returns is an array. It returns an object if I don't pass
true; if you pass true, then it returns an array. So, okay, I have this object,
and these pieces are now right there. I'll echo stuff out later. So the JSON is
an array. So now, let's go into each one of these.
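To make the difference between the two return types concrete, here is a small sketch. The sample JSON, the key names `title` and `url`, and the path are assumptions for illustration; only the "Get Setup" title comes from the walkthrough.

```php
<?php
// Sample JSON standing in for the site's config file. The keys "title" and
// "url" and the path are assumptions made for illustration.
$json = '[{"title": "Get Setup", "url": "/dev/get-setup.md"}]';

// Without the second argument, JSON objects become PHP objects (stdClass).
$asObjects = json_decode($json);
echo $asObjects[0]->title, "\n";

// Passing true turns JSON objects into associative arrays instead.
$asArrays = json_decode($json, true);
echo $asArrays[0]['title'], "\n";

// json_decode returns null when the input is not valid JSON.
var_dump(json_decode('not json'));
```

Either form works for looping; the array form is what the `$item['title']` style of access needs.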
So for each one of the elements in the JSON array: one, two, three, four, five,
six. So six elements, or up to five if you're counting from zero. Don't worry.
So we have a JSON array with six elements, and for each one of those items,
let's go ahead and print something. Let's go ahead and print out the URL.
Inside each one of these items there's an associative array. What associative
array means is that there's a key and a value in the array. So if I do item
title and echo that out, it's going to look really, really messy. Haha, messy:
it concatenated all the titles together. Okay, but what we want is the URLs.
So it's got the URLs. Okay, even more messy, but that's what we wanted.
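A sketch of that loop, under the assumption that each item has `title` and `url` keys. The two entries here are made-up stand-ins for the six on the real site. Joining with newlines avoids the run-together output described above.

```php
<?php
// Hypothetical config data: six entries would work the same way as these two.
$json = '[
    {"title": "Get Setup", "url": "/dev/get-setup.md"},
    {"title": "Second Article", "url": "/dev/second-article.md"}
]';

// true => each item is an associative array (key => value).
$items = json_decode($json, true);

// Collect one line per item, then join with newlines so the entries
// do not run together when printed.
$lines = [];
foreach ($items as $item) {
    $lines[] = $item['title'] . ' => ' . $item['url'];
}
echo implode("\n", $lines), "\n";
```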
So, okay, now that we have the URLs, well, we have file_get_contents. We're
going to go get the contents of those URLs. And if you look at the source again,
they start as a relative path, so we'll need to add the site's domain, the dot TK
address, in front of it in order to actually go and get the things we need.
Otherwise it would just do a Google search if I put it into the browser, and
it'll just kind of fall apart if we do it any other way. So concatenate that on:
concatenate the URL right there. That gets the contents of, I guess, everything
that's inside of whatever file we're looking at. So that is our stuff. If I was
to echo it right now, I think it'll grab everything. Cool.
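A sketch of the concatenation step. The base address `https://example.com`, the path, and the helper names `absoluteUrl` and `fetchArticle` are placeholders, not the real site's values; fetching a URL this way also assumes `allow_url_fopen` is enabled.

```php
<?php
// Hypothetical base address; the real site's domain would go here.
const BASE_URL = 'https://example.com';

// Join the base address and a relative path with exactly one slash between them.
function absoluteUrl(string $base, string $path): string
{
    return rtrim($base, '/') . '/' . ltrim($path, '/');
}

// Fetch one article; returns null on failure instead of false.
// The @ suppresses the warning file_get_contents raises when a request fails.
function fetchArticle(string $base, string $path): ?string
{
    $contents = @file_get_contents(absoluteUrl($base, $path));
    return $contents === false ? null : $contents;
}

// Demo: only builds the address, no network request is made here.
echo absoluteUrl(BASE_URL, '/dev/get-setup.md'), "\n";
```

Inside the loop over the decoded items, this would be called as `fetchArticle(BASE_URL, $item['url'])`.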
Everything. So there's like something down here that stopped, and something
there, and some more stuff. Let's make it a little bit easier to read.
Echo the item title, but let's also put something around it to make it bigger,
and after the title put a bunch of separators all around it. Now that's very
easy to see. There you go: Get Setup, and the rest of them.
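One way the formatting step might look. The choice of `<h2>` for "bigger" and `<pre>` for the body is an assumption, and `formatArticle` is a made-up helper name; the recording does not make the exact markup clear.

```php
<?php
// Format one article: the title as a heading, then the raw body in <pre>.
// Both values are escaped so markup inside them is displayed, not interpreted.
function formatArticle(string $title, string $body): string
{
    return '<h2>' . htmlspecialchars($title) . "</h2>\n"
         . '<pre>' . htmlspecialchars($body) . "</pre>\n";
}

// Demo with inline text standing in for a fetched markdown file.
echo formatArticle('Get Setup', "# Get Setup\nFirst step...");
```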
So yeah, with some simple code you're able to grab the data from where you need
to get it, and it's pretty simple, pretty straightforward, to be able to get the
data you need in a systematic way, really, really quickly, from simple REST APIs,
like GET and POST. It can get more complicated, more obfuscated, depending on
if there's JavaScript involved in loading certain elements of the page, and
you'll have to parse that out and figure out how to emulate the JavaScript via
PHP or Python or C#.
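The walkthrough only uses GET requests. For the POST case mentioned here, a sketch using a stream context with file_get_contents could look like the following; the helper name `postContext`, the form fields, and the endpoint in the comment are hypothetical.

```php
<?php
// Build a stream context that makes file_get_contents send a POST request
// with URL-encoded form fields as the body.
function postContext(array $fields)
{
    return stream_context_create([
        'http' => [
            'method'  => 'POST',
            'header'  => "Content-Type: application/x-www-form-urlencoded\r\n",
            'content' => http_build_query($fields),
        ],
    ]);
}

$context = postContext(['page' => 2, 'tag' => 'php']);

// The request itself would look like this (hypothetical endpoint, not run here):
// $response = file_get_contents('https://example.com/api/articles', false, $context);

// Show the options that would be sent, without touching the network.
print_r(stream_context_get_options($context));
```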
It's really up to you, but I hope this helps you get some data, and that what I
showed you will help you learn how to use file_get_contents and json_decode,
using them how you see fit, as long as it's legal. And, yeah, okay.
So thank you all so much for watching. I'll see you in the next video.
Good luck and farewell.