Thursday, 29 December 2016

PHP cURL: traversing search results

I am working on a website that lets people search for a product 'x' and displays the results, for example in a table.



I plan to scrape the search data from another website using PHP cURL. (The owner of the website being scraped is aware of this and allows it, so there are no legal issues.)



I already have PHP cURL code that logs in to the website and runs a search based on user input. What I don't know is how to go through the search results and output them on my website one by one.



PHP curl code:




$username = '********';
$password = '********';
$loginUrl = 'http://www.a-website.com/login.asp';

//init curl
$ch = curl_init();

//Set the URL to work with
curl_setopt($ch, CURLOPT_URL, $loginUrl);


// ENABLE HTTP POST
curl_setopt($ch, CURLOPT_POST, 1);

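//Set the post parameters; http_build_query() URL-encodes the values,
//so special characters in the credentials cannot break the request body
curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query(array(
    'username' => $username,
    'password' => $password,
    'submit1'  => 'Login',
)));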

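//Handle cookies for the login. The value is a placeholder: it must be
//the path of a writable file where cURL can store the session cookie
curl_setopt($ch, CURLOPT_COOKIEJAR, 'cookie stuff here');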


//Setting CURLOPT_RETURNTRANSFER variable to 1 will force cURL
//not to print out the results of its query.
//Instead, it will return the results as a string return value
//from curl_exec() instead of the usual true/false.
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);

//execute the request (the login)
$store = curl_exec($ch);

/* * *****************SEARCH HERE****************** */

curl_setopt($ch, CURLOPT_URL, 'http://www.a-website.com/Index.asp');
//execute the request
$content = curl_exec($ch);


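//Set the post parameters for the search form; $searchString holds the
//user's search term and http_build_query() URL-encodes it
curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query(array(
    'search_txt_vs'           => '',
    'search_txt_UPC'          => '',
    'search_txt_Name'         => $searchString,
    'search_txt_Manufacturer' => '',
    'submit'                  => 'Search',
)));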
//execute the request (the search)

$Search = curl_exec($ch);

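//print the search results, first JSON-encoded (CJSON is Yii's JSON helper), then as raw HTML
print CJSON::encode($Search);
print $Search;

//print the page that was fetched before the search
print $content;

//close the handle and free its resources
curl_close($ch);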


Here is the HTML from the website I'm scraping (which, by the way, uses an old-school table layout), shown here as it renders:

# | NDC | Brand Name | Strength | UD | Stock | Manufacturer | AWP | Your Price | UPC | Generic Alt/Name | Size | Form | Category
1 | 00169347718 | NOVOLIN 70/ 30U/ML CRT 5X3 ML | 70-30 U/ML | | YES | NOVO NORDISK PHARM | $0.01 | $0.01 | 000000000000 | HUM INSULIN NPH/REG INSULIN HM | 5X3ML | | INSULIN
2 | 00169347418 | NOVOLIN N 100 UN/ML CRT 5X3 ML | 100 U/ML | | YES | NNP | $0.00 | $0.01 | 000000000000 | NPH HUMAN INSULIN ISOPHANE | 5X3ML | | INSULIN
3 | 00169231721 | NOVOLIN INNO 70/30 PFS 5X3 ML | 70-30 U/ML | | YES | NOVO NORDISK PHARM | $0.00 | $0.01 | 000000000000 | HUM INSULIN NPH/REG INSULIN HM | 5X3ML | | INSULIN
4 | 00169183311 | NOVOLIN R 100 UN/ML VL 10 ML | 100 U/ML | | YES | NOVO NORDISK PHARM | $99.00 | $82.09 | 000169183311 | INSULIN REGULAR HUMAN | 10ML | | INSULIN
5 | 00169183711 | NOVOLIN 70/ 30U/ML VL 10 ML | 70-30 U/ML | | YES | NOVO NORDISK PHARM | $99.00 | $82.09 | 000169183711 | HUM INSULIN NPH/REG INSULIN HM | 10ML | | INSULIN
6 | 00169183411 | NOVOLIN N 100 UN/ML VL 10 ML | 100 U/ML | | YES | NOVO NORDISK PHARM | $99.00 | $82.09 | 000000000000 | NPH HUMAN INSULIN ISOPHANE | 10ML | | INSULIN

(On the page, the header row begins with a "Sort >" label, and each result row also carries an [add] link before the NDC and a [return] link before the category.)
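
One possible way to walk through results like these is to load the returned HTML into PHP's DOMDocument and loop over the table rows with DOMXPath. The sketch below is only an illustration under stated assumptions: the real markup of the scraped page is not shown, so `$sampleHtml`, the `extractRows()` helper, and the three-column layout are all made up for the example. In practice you would pass in `$Search` from the cURL code and adjust the XPath query to match the actual page.

```php
<?php
// Hypothetical stand-in for the string returned by curl_exec(). The structure
// of the real page is unknown, so this simple table is an assumption.
$sampleHtml = <<<'HTML'
<html><body>
<table>
  <tr><th>NDC</th><th>Brand Name</th><th>Your Price</th></tr>
  <tr><td>00169183311</td><td>NOVOLIN R 100 UN/ML VL 10 ML</td><td>$ 82.09</td></tr>
  <tr><td>00169183711</td><td>NOVOLIN 70/ 30U/ML
      VL 10 ML</td><td>$ 82.09</td></tr>
</table>
</body></html>
HTML;

/**
 * Hypothetical helper: returns one array of cell texts per data row.
 */
function extractRows(string $html): array
{
    $doc = new DOMDocument();

    // Old table-based pages are rarely valid HTML, so collect parser
    // warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $rows  = [];

    // Select only rows that have <td> children; header rows that use <th> are skipped.
    foreach ($xpath->query('//table//tr[td]') as $tr) {
        $cells = [];
        foreach ($xpath->query('td', $tr) as $td) {
            // Collapse line breaks and repeated spaces inside each cell.
            $cells[] = trim(preg_replace('/\s+/', ' ', $td->textContent));
        }
        $rows[] = $cells;
    }

    return $rows;
}

// Output the rows one by one, escaping each value before it is echoed as HTML.
foreach (extractRows($sampleHtml) as $row) {
    echo '<tr><td>'
        . implode('</td><td>', array_map('htmlspecialchars', $row))
        . "</td></tr>\n";
}
```

Each entry returned by `extractRows()` is a plain array of cell texts, so the same loop could just as well build an associative array per product or insert the values into a database. This relies on PHP's DOM extension, which is enabled in most installations.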



