SA-MP Forums

Old 12/11/2018, 05:44 PM   #1
SyS

PawnScraper




A powerful scraper plugin that provides an interface for utilising HTML parsers and CSS selectors in Pawn.

Installing

Thanks to Southclaws, plugin installation is now much easier with sampctl:

PHP Code:
sampctl p install Sreyas-Sreelal/pawn-scraper 
OR
  • Download the binary files for your operating system from the releases
  • Add them to your plugins folder
  • Add PawnScraper (or PawnScraper.so on Linux) to the plugins line of server.cfg
  • Add pawnscraper.inc to your includes folder
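
For the server.cfg step, the entry would look roughly like the fragment below. This is a sketch that assumes the binary keeps the name used above and that no other plugins are loaded; if the plugins line already exists, append the name to it instead of adding a second line.

```
plugins PawnScraper
```

On Linux the same line would read plugins PawnScraper.so.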

Building
  • Clone the repo

    PHP Code:
    git clone https://github.com/Sreyas-Sreelal/pawn-scraper.git 
  • Compile the plugin with the nightly Rust toolchain
    • Windows
      PHP Code:
      cargo +nightly-i686-pc-windows-msvc build --release 
    • Linux
      PHP Code:
      cargo +nightly-i686-unknown-linux-gnu build --release 

Example Usage

A small example that fetches all links on wiki.sa-mp.com:

pawn Code:
new Response:response = HttpGet("https://wiki.sa-mp.com");
if(response == INVALID_HTTP_RESPONSE){
    printf("HTTP ERROR");
    return;
}

new Html:html = ResponseParseHtml(response);
if(html == INVALID_HTML_DOC){
    DeleteResponse(response);
    return;
}

new Selector:selector = ParseSelector("a");
if(selector == INVALID_SELECTOR){
    DeleteResponse(response);
    DeleteHtml(html);
    return;
}

new str[500],i;
while(GetNthElementAttrVal(html,selector,i,"href",str)){
    printf("%s",str);
    ++i;
}
// delete the created objects after use
DeleteHtml(html);
DeleteResponse(response);
DeleteSelector(selector);

The same example using a threaded HTTP call would be:

pawn Code:
HttpGetThreaded(0,"MyHandler","https://wiki.sa-mp.com");
//...
forward MyHandler(playerid,Response:responseid);
public MyHandler(playerid,Response:responseid){
    if(responseid == INVALID_HTTP_RESPONSE){
        printf("HTTP ERROR");
        return 0;
    }

    new Html:html = ResponseParseHtml(responseid);
    if(html == INVALID_HTML_DOC){
        DeleteResponse(responseid);
        return 0;
    }

    new Selector:selector = ParseSelector("a");
    if(selector == INVALID_SELECTOR){
        DeleteResponse(responseid);
        DeleteHtml(html);
        return 0;
    }

    new str[500],i;
    while(GetNthElementAttrVal(html,selector,i,"href",str)){
        printf("%s",str);
        ++i;
    }

    // delete the created objects after use
    DeleteHtml(html);
    DeleteResponse(responseid);
    DeleteSelector(selector);
    return 1;
}



More examples can be found in examples
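
As one further sketch, the extraction loop from the examples can be wrapped in a small helper so that the selector and attribute become parameters. PrintAttrValues is a hypothetical name and not part of the plugin; it only uses the natives shown in the examples, and it assumes they behave the way those examples suggest.

```pawn
// Hypothetical helper, not part of pawnscraper.inc.
// Prints the given attribute of every element matching a CSS selector
// and returns how many values were printed (0 if the selector is invalid
// or nothing matched).
stock PrintAttrValues(Html:html, const css[], const attr[])
{
    new Selector:selector = ParseSelector(css);
    if(selector == INVALID_SELECTOR){
        return 0;
    }

    new value[500], i;
    while(GetNthElementAttrVal(html, selector, i, attr, value)){
        printf("%s", value);
        ++i;
    }

    // the helper created the selector, so it also deletes it
    DeleteSelector(selector);
    return i;
}
```

With a parsed document in html, a call such as PrintAttrValues(html, "img", "src"); would list image sources instead of links. The caller still owns html and the response and has to delete them afterwards.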

Repository
https://github.com/Sreyas-Sreelal/pawn-scraper

Note

The plugin is at an early stage, and more tests and features need to be added. I'm open to any kind of contribution; just open a pull request if you have anything to improve or new features to add. This plugin was written in order to get familiar with Rust and sharpen my skills in it.

Special thanks

Last edited by SyS; 30/11/2018 at 01:53 PM.
Old 12/11/2018, 05:59 PM   #2
Gabriel432135

cool
Old 12/11/2018, 06:12 PM   #3
kvann

hot.
Old 12/11/2018, 08:44 PM   #4
Ermanhaut

This is really good.
Old 15/11/2018, 09:38 PM   #5
Chaprnks

Amazing! Finally a well-rounded solution to the HTTP() function
Old 24/11/2018, 01:39 PM   #6
SyS

New version released!

https://github.com/Sreyas-Sreelal/pa...ases/tag/0.1.0

Changes
  • Added HttpGetThreaded
  • Changed reqwest to minihttp
  • Smaller binary

It might still need more tests, but the basic functionality is working as expected. Big thanks to Eva, who patiently listened to my questions and doubts and gave me guidance in certain parts.

Usage of HttpGetThreaded
pawn Code:
HttpGetThreaded(0,"MyHandler","https://wiki.sa-mp.com");
//...
forward MyHandler(playerid,Response:responseid);
public MyHandler(playerid,Response:responseid){
    if(responseid == INVALID_HTTP_RESPONSE){
        printf("HTTP ERROR");
        return 0;
    }

    new Html:html = ResponseParseHtml(responseid);
    if(html == INVALID_HTML_DOC){
        DeleteResponse(responseid);
        return 0;
    }

    new Selector:selector = ParseSelector("a");
    if(selector == INVALID_SELECTOR){
        DeleteResponse(responseid);
        DeleteHtml(html);
        return 0;
    }

    new str[500],i;
    while(GetNthElementAttrVal(html,selector,i,"href",str)){
        printf("%s",str);
        ++i;
    }

    // delete the created objects after use
    DeleteHtml(html);
    DeleteResponse(responseid);
    DeleteSelector(selector);
    return 1;
}

Last edited by SyS; 30/11/2018 at 01:52 PM.
Old 24/11/2018, 06:13 PM   #7
Infin1ty

no
no you didnt
:O
Old 26/11/2018, 01:14 PM   #8
AmirSavand

SA-MP HTTP requests are known to fail without a reason, so do the HTTP calls here always succeed without bugs?
Old 26/11/2018, 01:18 PM   #9
SyS

Quote:
Originally Posted by AmirSavand
SA-MP HTTP requests are known to fail without a reason, so do the HTTP calls here always succeed without bugs?
HTTP requests are working fine as per the tests; if you encounter any bugs, open an issue on GitHub. But do note that the main scope of this plugin is not sending HTTP requests (the plugin can only send GET requests) but parsing HTML documents and using CSS selectors. Southclaws' requests plugin already gives a better solution for HTTP requests.
Old 26/11/2018, 02:52 PM   #10
fiki574

Nice work!

However, is there any way to send an HTTP request to the SA-MP server itself instead of only to external URLs?