LogFAQs > #922713710

Topic List
Page List: 1
Topic: I need some Python help. Is anyone strong in web scraping with Python?
WiggumFan267
06/02/19 10:47:07 PM
#1:


I have a task: scrape a Reddit-like page (https://news.ycombinator.com/) and pull together a list of articles with:
-Article name
-URL
-Submitter
-Submission time
-Number of upvote points
-Number of comments

I am a Python semi-newb and completely new to anything like web scraping, so I wanted to know if anyone could help.

After reading some articles, I got as far as importing the packages "requests" and "BeautifulSoup", but I'm not sure how to use them to get what I need.

I see that, for example:
The article name always appears between the strings 'class="storylink">' and '</a>'
The URL is between '<a href=' and 'class="storylink">'
The submitter is between 'class="hnuser">' and '</a>'
I'm not sure about submission time, upvote points, or comments, because they always seem to precede a unique ID
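
To make the idea concrete, here is a rough sketch of the "find what is between two strings" approach using only str.find. The between() helper and the SAMPLE string are made up for illustration; SAMPLE is a hand-written stand-in modeled on the markers listed above, not the real page source.

```python
def between(text, start, end, pos=0):
    """Return (substring, next_pos) for the first start...end pair found
    at or after pos, or (None, -1) if either marker is missing."""
    i = text.find(start, pos)
    if i == -1:
        return None, -1
    i += len(start)
    j = text.find(end, i)
    if j == -1:
        return None, -1
    return text[i:j], j + len(end)


# Hypothetical sample markup (an assumption, not copied from the live site).
SAMPLE = (
    '<a href="https://example.com/post" class="storylink">Example post</a>'
    '<a href="user?id=alice" class="hnuser">alice</a>'
)

title, _ = between(SAMPLE, 'class="storylink">', '</a>')
# This only works here because the story link is the first <a href= in SAMPLE;
# on a real page an earlier, unrelated link would match first.
url, _ = between(SAMPLE, '<a href="', '" class="storylink">')
user, _ = between(SAMPLE, 'class="hnuser">', '</a>')

print(title, url, user)
```

The second return value is the position just past the end marker, so calling between() again with pos set to it would walk through repeated matches one at a time.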

So I don't know if this idea is on the right track. If it is, how should I go about it in general? If not, is there a better way to do it? I guess I would just search for each pair of text strings and take what is between them, but I'm not sure if that's right, or exactly how to do it.
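
One possible alternative to raw string searching is to let an HTML parser find the tags and match on their class attributes. The sketch below uses the standard library's html.parser rather than BeautifulSoup, purely so it runs without installing anything. StoryParser and SAMPLE_HTML are hypothetical names, and the sample markup is an assumption based on the class names mentioned above.

```python
from html.parser import HTMLParser


class StoryParser(HTMLParser):
    """Collect title, URL and submitter from <a> tags, keyed on their class."""

    def __init__(self):
        super().__init__()
        self.stories = []    # one dict per story link found
        self._field = None   # field the current <a> text belongs to, if any

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        cls = attrs.get("class")
        if cls == "storylink":
            # A story link starts a new record.
            self.stories.append(
                {"title": "", "url": attrs.get("href"), "user": None}
            )
            self._field = "title"
        elif cls == "hnuser" and self.stories:
            # A submitter link is attached to the most recent story.
            self.stories[-1]["user"] = ""
            self._field = "user"

    def handle_data(self, data):
        if self._field:
            self.stories[-1][self._field] += data

    def handle_endtag(self, tag):
        if tag == "a":
            self._field = None


# Hypothetical sample markup (an assumption, not the real page source).
SAMPLE_HTML = (
    '<a href="https://example.com/a" class="storylink">First post</a>'
    '<a href="user?id=alice" class="hnuser">alice</a>'
    '<a href="https://example.com/b" class="storylink">Second post</a>'
    '<a href="user?id=bob" class="hnuser">bob</a>'
)

parser = StoryParser()
parser.feed(SAMPLE_HTML)
stories = parser.stories

for story in stories:
    print(story["title"], story["url"], story["user"])
```

Matching on tag and class means the surrounding text does not have to be identical for every article, which is where the string-marker idea gets stuck.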

Any help would be appreciated.
---
~Wigs~ 3-Time Consecutive Fantasy B8 Baseball Champion
2015 NATIONAL LEAGUE CHAMPION NEW YORK METS