[Python] Python Web Crawling, Part 2: Fetching Product Lists by Amazon Best Sellers Category: How to Parse ul, ol, and li Tags

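Today we will scrape the item (product) list for each category on the Amazon Best Sellers page.

The method is not difficult, but it does require some prior knowledge of HTML. The Any Department category list is built with ul and li tags. We will fetch the category names, number them, read a number from the user, and then fetch the product list for the chosen category. Open the Chrome developer tools (press F12) and locate the target elements.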

 

Target: www.amazon.com/bestsellers


The Python script that fetches the category names is shown below. It uses the BeautifulSoup class to parse the page.

The find() and find_all() methods locate the ul and li tags, and the matching li tags are collected in a list.

import urllib.request
import urllib.parse
from bs4 import BeautifulSoup

url = 'https://www.amazon.com/bestsellers'
request = urllib.request.urlopen(url)
html = request.read()
beautifulSoup = BeautifulSoup(html, 'html.parser')

find_ul = beautifulSoup.find('ul', id='zg_browseRoot').find('ul')
li_list = find_ul.find_all('li')


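# Links (get_link_list is defined further down)
links_list = get_link_list(li_list)

# Number the category list (get_category_numbering is defined further down)
select_list = get_category_numbering(li_list)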
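
To see how find() and find_all() behave on nested ul and li tags without touching the live site, here is a minimal sketch that runs on a small hand-written HTML string. The markup is invented for illustration and only imitates the shape of a category menu; the id is borrowed from the script above, and nothing here claims to match Amazon's current page.

```python
from bs4 import BeautifulSoup

# Made-up markup that imitates a nested category menu (not Amazon's real HTML).
SAMPLE_HTML = """
<ul id="zg_browseRoot">
  <li>Any Department</li>
  <ul>
    <li><a href="/best-sellers/appliances">Appliances</a></li>
    <li><a href="/best-sellers/books">Books</a></li>
  </ul>
</ul>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

root = soup.find("ul", id="zg_browseRoot")  # first matching tag, or None if absent
inner = root.find("ul")                     # first <ul> nested inside the root
items = inner.find_all("li")                # every <li> under the inner <ul>

names = [li.get_text(strip=True) for li in items]
print(names)
```

find() returns a single tag (or None), while find_all() returns a list-like ResultSet, which is why the second call can be looped over directly.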

The following function numbers the category names after they have been fetched.

def get_category_numbering(li_list):
    rtn_list = []
    count = 1
    # Heading line shown above the numbered categories
    rtn_list.append("### Amazon Best Sellers categories\n\n")
    for item in li_list:
        rtn_list.append(str(count) + "." + item.get_text())
        count += 1

    return rtn_list

Next is a function that parses the a tags to collect the URL of each category and returns them as a dictionary keyed by category number.

def get_link_list(li_list):
    # Map each category number (starting at 1) to its href value
    rtn_list = {}
    for count, item in enumerate(li_list, start=1):
        link = item.find('a', href=True).attrs['href']
        rtn_list[count] = link

    return rtn_list
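
Both helpers do the same kind of counting, so the numbering and the link lookup can be tried side by side on a tiny sample. This is only a sketch: the HTML and the function names number_categories and map_links are made up for the example, and map_links skips any li tag that has no link instead of raising an error.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup; the hrefs are placeholders.
SAMPLE_HTML = """
<ul>
  <li><a href="/zgbs/appliances">Appliances</a></li>
  <li><a href="/zgbs/books">Books</a></li>
</ul>
"""


def number_categories(li_tags):
    """Return labels such as '1.Appliances' for each <li> tag."""
    return [f"{n}.{li.get_text(strip=True)}" for n, li in enumerate(li_tags, start=1)]


def map_links(li_tags):
    """Return {1: href, 2: href, ...}, skipping <li> tags without a link."""
    links = {}
    for n, li in enumerate(li_tags, start=1):
        a = li.find("a", href=True)
        if a is not None:
            links[n] = a["href"]
    return links


li_tags = BeautifulSoup(SAMPLE_HTML, "html.parser").find_all("li")
labels = number_categories(li_tags)
links = map_links(li_tags)
print(labels)
print(links)
```

Because both functions number from 1 in the same order, the number shown to the user is also the key used to look up the link.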

 

The next script shows the category list to the user and asks for a number. The list is split so that five entries appear on each line.

# Split into 5 entries per line
category_department = []
cnt = 1
for item in select_list:
    category_department.append(item + '\t\t')

    if (cnt % 5) == 0:
        category_department.append('\n')

    cnt += 1

category_department.append('\n\n' + 'Select a category: ')

input_val = input(''.join(category_department))  # type str
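
The five-per-line layout can also be produced by slicing the labels into chunks, which avoids the manual counter. The sketch below is one possible alternative, not the original approach; format_menu and the sample labels are hypothetical, and it takes only the numbered labels (no heading line).

```python
def format_menu(labels, per_row=5):
    """Join labels with tabs, starting a new line after every `per_row` labels."""
    rows = []
    for start in range(0, len(labels), per_row):
        rows.append("\t\t".join(labels[start:start + per_row]))
    return "\n".join(rows)


# Placeholder labels standing in for the numbered category names.
labels = [f"{n}.Category {n}" for n in range(1, 8)]
menu = format_menu(labels)
print(menu)
```

The resulting string can be passed to input() together with the prompt text, for example input(menu + '\n\nSelect a category: ').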




# Output
C:\Users\ilike\AppData\Local\Programs\Python\Python39\python.exe C:/python/Workspace/main.py
### Amazon Best Sellers categories

		1.Amazon Devices & Accessories		2.Amazon Launchpad		3.Amazon Pantry		4.Appliances		
5.Apps & Games		6.Arts, Crafts & Sewing		7.Audible Books & Originals		8.Automotive		9.Baby		
10.Beauty & Personal Care		11.Books		12.CDs & Vinyl		13.Camera & Photo		14.Cell Phones & Accessories		
15.Clothing, Shoes & Jewelry		16.Collectible Currencies		17.Computers & Accessories		18.Digital Educational Resources		19.Digital Music		
20.Electronics		21.Entertainment Collectibles		22.Gift Cards		23.Grocery & Gourmet Food		24.Handmade Products		
25.Health & Household		26.Home & Kitchen		27.Industrial & Scientific		28.Kindle Store		29.Kitchen & Dining		
30.Magazine Subscriptions		31.Movies & TV		32.Musical Instruments		33.Office Products		34.Patio, Lawn & Garden		
35.Pet Supplies		36.Software		37.Sports & Outdoors		38.Sports Collectibles		39.Tools & Home Improvement		
40.Toys & Games		41.Video Games		

Select a category: 4

Once a category is selected, the script requests that category's link to load its product list page. This calls for another round of HTML parsing, this time on the ol and li tags.

print(f'You selected number {input_val}.')
target_link = links_list.get(int(input_val))
print(f'Target link:{target_link}')
request.close()

request = urllib.request.urlopen(target_link)
html = request.read()
beautifulSoup = BeautifulSoup(html, 'html.parser')

find_ol = beautifulSoup.find('ol', id='zg-ordered-list')
li_list = find_ol.find_all('li')
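
The selection step assumes the user types a valid number. A small sketch of one way to guard it is shown below, along with a request object that carries a User-Agent header. Both pick_link and build_request are hypothetical helpers, and the header value is just an example string; some sites may respond differently to the default urllib User-Agent, but whether that applies here is an assumption, not something the script above confirms.

```python
import urllib.request


def pick_link(links, raw_value):
    """Return the link for the typed number, or None if the input is unusable."""
    try:
        number = int(raw_value.strip())
    except ValueError:
        return None
    return links.get(number)


def build_request(url):
    # Assumption: a browser-like User-Agent; the exact string is only an example.
    return urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})


links = {1: "https://www.amazon.com/Best-Sellers-Appliances/zgbs/appliances"}

print(pick_link(links, "1"))    # the stored URL
print(pick_link(links, "abc"))  # None: not a number
print(pick_link(links, "9"))    # None: no such category

req = build_request(links[1])   # pass this to urllib.request.urlopen(req)
```

Checking for None before calling urlopen() keeps a typo at the prompt from turning into a traceback.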

Next, the product detail link, product name, price, and image URL of each product are parsed and stored in a list.

got_item_list = []

for item in li_list:
    link = "https://www.amazon.com" + (item.find('a', href=True).attrs['href'])
    title = item.find('a', href=True).text.replace("\n\n            ", "").replace("\n        \n", "")
    price = item.find('span', attrs={'class': 'p13n-sc-price'}).text
    img_url = item.find('div', attrs={'class': 'a-section'}).img['src']

    got_item_list.append([title, price, link, img_url])


print("Total count : ", len(got_item_list))
print(got_item_list[0])
print('-'*50)
for item in got_item_list:
    print(item)
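
The loop above calls .text directly on the result of find(), which raises AttributeError if an item has no price element. As a hedged sketch, the version below returns None for missing fields instead. The sample HTML is invented (the class name and id are borrowed from the script above), and text_or_none and parse_item are hypothetical helper names.

```python
from bs4 import BeautifulSoup

# Made-up markup: the second item deliberately has no price element.
SAMPLE_HTML = """
<ol id="zg-ordered-list">
  <li><a href="/dp/AAA">  Ice Maker  </a><span class="p13n-sc-price">$99.99</span></li>
  <li><a href="/dp/BBB">Range Hood</a></li>
</ol>
"""


def text_or_none(tag):
    """Return the stripped text of a tag, or None when the tag was not found."""
    return tag.get_text(strip=True) if tag is not None else None


def parse_item(li):
    a = li.find("a", href=True)
    return {
        "title": text_or_none(a),
        "link": "https://www.amazon.com" + a["href"] if a is not None else None,
        "price": text_or_none(li.find("span", class_="p13n-sc-price")),
    }


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
items = [parse_item(li) for li in soup.find("ol", id="zg-ordered-list").find_all("li")]
for item in items:
    print(item)
```

get_text(strip=True) also trims the surrounding whitespace, so the chained replace() calls are not needed in this variant.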

Another way to get the href attribute value is to find the a tag and call get() on it. With find_all('a'), you can collect the href values of every a tag.

for item in li_list:
    link = "https://www.amazon.com" + item.find('a').get("href")
    # ... (rest omitted)
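
The difference between these lookups is easiest to see on a small sample. The following sketch uses invented markup with one a tag that has no href, to show that get("href") returns None in that case while find_all('a', href=True) filters such tags out.

```python
from bs4 import BeautifulSoup

# Made-up markup: the last <a> has no href attribute.
SAMPLE_HTML = """
<li>
  <a href="/dp/AAA">Product page</a>
  <a href="/product-reviews/AAA">Reviews</a>
  <a>No link here</a>
</li>
"""

li = BeautifulSoup(SAMPLE_HTML, "html.parser").find("li")

first_href = li.find("a").get("href")                      # href of the first <a>
all_hrefs = [a.get("href") for a in li.find_all("a")]      # None where href is missing
only_real = [a["href"] for a in li.find_all("a", href=True)]  # only tags that have href

print(first_href)
print(all_hrefs)
print(only_real)
```

Indexing with a["href"] raises KeyError when the attribute is missing, so get() is the safer choice unless the search already filters on href=True.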
 

[Output]

You selected number 4.
Target link:https://www.amazon.com/Best-Sellers-Appliances/zgbs/appliances
Total count :  50
['GE Profile Opal | Countertop Nugget Ice Maker', '$465.99', 'https://www.amazon.com/GE-Profile-Countertop-Nugget-Maker/dp/B07YF9SGBW?_encoding=UTF8&psc=1', 'https://images-na.ssl-images-amazon.com/images/I/61NmZOf6f4L._AC_UL200_SR200,200_.jpg']
--------------------------------------------------
['GE Profile Opal | Countertop Nugget Ice Maker', '$465.99', 'https://www.amazon.com/GE-Profile-Countertop-Nugget-Maker/dp/B07YF9SGBW?_encoding=UTF8&psc=1', 'https://images-na.ssl-images-amazon.com/images/I/61NmZOf6f4L._AC_UL200_SR200,200_.jpg']
['Euhomy Ice Maker Machine Countertop, 26 lbs in 24 Hours, 9 Cubes Ready in 8 Mins, Electric ice maker and Compact potable ice maker with Ice Scoop and Basket. Perfect for Home/Kitchen/Office.(Sliver)', '$109.99', 'https://www.amazon.com/Countertop-Machine-hrs-Ice-Compact-Lightweight/dp/B07R56HW4G?_encoding=UTF8&psc=1', 'https://images-na.ssl-images-amazon.com/images/I/71gk02TTfNL._AC_UL200_SR200,200_.jpg']
['Crownful Ice Maker Countertop Machine, 9 Ice Cubes Ready in 8-10 Minutes, 26lbs Bullet Ice Cubes in 24H, Electric Ice Maker with Scoop and Basket - Black', '$95.99', 'https://www.amazon.com/CROWNFUL-Countertop-Machine-Minutes-Electric/dp/B087B9YCX4?_encoding=UTF8&psc=1', 'https://images-na.ssl-images-amazon.com/images/I/61cVLVRavqL._AC_UL200_SR200,200_.jpg']
['IKICH Portable Ice Maker Machine for Countertop, Ice Cubes Ready in 6 Mins, Make 26 lbs Ice in 24 Hrs with LED Display Perfect for Parties Mixed Drinks, Electric Ice Maker 2L with Ice Scoop and Basket', '$129.99', 'https://www.amazon.com/IKICH-Portable-Ice-Maker-Machine/dp/B07Q33HD6X?_encoding=UTF8&psc=1', 'https://images-na.ssl-images-amazon.com/images/I/71-3uHTzHIL._AC_UL200_SR200,200_.jpg']
['Igloo ICEBNH26BK Automatic Self-Cleaning Portable Electric Countertop Maker Machine, 26 Pounds in 24 Hours, 9 Cubes Ready in 7 Minutes, with Ice Scoop and Basket-Black', '$106.99', 'https://www.amazon.com/Igloo-ICEBNH26BK-Self-Cleaning-Countertop-Basket-Black/dp/B08FKYCYPY?_encoding=UTF8&psc=1', 'https://images-na.ssl-images-amazon.com/images/I/71H5ZcWuDOL._AC_UL200_SR200,200_.jpg']
['AGLUCKY Ice Maker Machine for Countertop, Portable Ice Cube Makers, Make 26 lbs ice in 24 hrs,Ice Cube Rready in 6-8 Mins with Ice Scoop and Basket for Home/Office/Bar (Black)', '$114.99', 'https://www.amazon.com/AGLUCKY-Machine-Countertop-Portable-Makers/dp/B08FZYMWJT?_encoding=UTF8&psc=1', 'https://images-na.ssl-images-amazon.com/images/I/711hYiu-6pL._AC_UL200_SR200,200_.jpg']
['Frigidaire EFIC206-TG-SILVER Compact Ice Maker, 26 lb per Day, Silver', '$109.99', 'https://www.amazon.com/Frigidaire-EFIC206-SILVER-Ice-Maker-Through/dp/B075QP7SCP?_encoding=UTF8&psc=1', 'https://images-na.ssl-images-amazon.com/images/I/71IXWvUmsGL._AC_UL200_SR200,200_.jpg']
['IKICH Ice Maker Countertop, 26lbs 24Hrs, 9 Cubes Ready in 7mins, Portable Electric Maker with LED Indicator Lights, Ice Scoop and Basket for Home Office Bar Party, Black', '$104.89', 'https://www.amazon.com/IKICH-Countertop-Portable-Electric-Indicator/dp/B0894W324Z?_encoding=UTF8&psc=1', 'https://images-na.ssl-images-amazon.com/images/I/61VgCja9aML._AC_UL200_SR200,200_.jpg']
['Igloo ICEB26HNAQ Automatic Self-Cleaning Portable Electric Countertop Ice Maker Machine With Handle, 26 Pounds in 24 Hours, 9 Ice Cubes Ready in 7 minutes, With Ice Scoop and Basket, Aqua', '$118.99', 'https://www.amazon.com/Igloo-ICEB26HNAQ-Automatic-Self-Cleaning-Countertop/dp/B07XS7Q54M?_encoding=UTF8&psc=1', 'https://images-na.ssl-images-amazon.com/images/I/71MI9NBksIL._AC_UL200_SR200,200_.jpg']
['Crownful Ice Maker Machine for Countertop, 9 Ice Cubes Ready in 8-10 Minutes, 26lbs Bullet Ice Cubes in 24H, Electric Ice Maker with Scoop and Basket', '$129.99', 'https://www.amazon.com/Crownful-Machine-Countertop-Minutes-Electric/dp/B082C9ML88?_encoding=UTF8&psc=1', 'https://images-na.ssl-images-amazon.com/images/I/714kU%2BPsSbL._AC_UL200_SR200,200_.jpg']
['Euhomy Ice Maker Machine Countertop, 40Lbs/24H Portable Compact Ice Cube Maker, With Ice Scoop & Basket, Perfect for Home/Kitchen/Office/Bar (Sliver)', '$214.99', 'https://www.amazon.com/Machine-Countertop-Portable-Compact-Perfect/dp/B07ZV318K3?_encoding=UTF8&psc=1', 'https://images-na.ssl-images-amazon.com/images/I/71YziGGQOLL._AC_UL200_SR200,200_.jpg']
['KUPPET Compact Twin Tub Portable Mini Washing Machine 26lbs Capacity, Washer(18lbs)&Spiner(8lbs)/Built-in Drain Pump/Semi-Automatic (White)', '$164.99', 'https://www.amazon.com/KUPPET-Compact-Portable-Capacity-Semi-Automatic/dp/B08P5Q7N77?_encoding=UTF8&psc=1', 'https://images-na.ssl-images-amazon.com/images/I/61cA6Rb7RDL._AC_UL200_SR200,200_.jpg']
['Broan-NuTone 412101 Non-Ducted Ductless Range Hood with Lights Exhaust Fan for Under Cabinet, 21-Inch, White', '$39.00', 'https://www.amazon.com/Broan-Ventless-Under-Cabinet-Range/dp/B000HZY0HC?_encoding=UTF8&psc=1', 'https://images-na.ssl-images-amazon.com/images/I/61z%2BWdmGUEL._AC_UL200_SR200,200_.jpg']
['Ice Maker Countertop Portable Ice Makers VPCOK with Ice Spoon and Basket, 26 lbs / 12 kg in 24 Hours, 2 Ice Sizes, 2.2 L, 9 Ice Cubes Per 6-13 Min', '$107.77', 'https://www.amazon.com/VPCOK-Maker-Portable-Machine-Basket/dp/B085T16SJ6?_encoding=UTF8&psc=1', 'https://images-na.ssl-images-amazon.com/images/I/71wGsWG2qHL._AC_UL200_SR200,200_.jpg']
['hOmeLabs Chill Pill Countertop Ice Maker - Perfect Ice in 8 to 10 Minutes - 26 Pounds Per Day Production To Keep You Iced Out Of Your Mind This Holiday Season', '$129.99', 'https://www.amazon.com/hOmeLabs-Portable-Maker-Machine-Countertop/dp/B071J2LSQS?_encoding=UTF8&psc=1', 'https://images-na.ssl-images-amazon.com/images/I/81y-YfuP81L._AC_UL200_SR200,200_.jpg']
['Midea MRC04M3AWW Single Door Chest Freezer, 3.5 Cubic Feet, White', '$228.00', 'https://www.amazon.com/midea-WHS-129C1-Single-Chest-Freezer/dp/B00MVVITWC?_encoding=UTF8&psc=1', 'https://images-na.ssl-images-amazon.com/images/I/61Z12He7qdL._AC_UL200_SR200,200_.jpg']
['Euhomy Ice Maker Countertop, 26lbs/24H Portable Compact ice maker machine, 9 Ice cubes ready in 8 Mins, with Ice Scoop & Basket, Perfect for Home/Kitchen/Office/Bar (Grey)', '$114.99', 'https://www.amazon.com/Countertop-Portable-Compact-machine-Perfect/dp/B0887ZKG6F?_encoding=UTF8&psc=1', 'https://images-na.ssl-images-amazon.com/images/I/71nKiuh4STL._AC_UL200_SR200,200_.jpg']
['Igloo ICEB26AQ Automatic Portable Electric Countertop Ice Maker Machine, 26 Pounds in 24 Hours, 9 Ice Cubes Ready in 7 Minutes, With Ice Scoop and Basket, Perfect for Water Bottles, Mixed Drinks', '$118.99', 'https://www.amazon.com/Igloo-ICEB26AQ-26-Pound-Automatic-Countertop/dp/B07GY4DS42?_encoding=UTF8&psc=1', 'https://images-na.ssl-images-amazon.com/images/I/71VnpdfklqL._AC_UL200_SR200,200_.jpg']
['Euhomy Mini Freezer Countertop, 1.1 Cubic Feet, Single Door Compact Upright Freezer with Reversible Door, Removable Shelves, Small freezer for Home/Dorms/Apartment/Office(Black)', '$154.99', 'https://www.amazon.com/Countertop-Reversible-Adjustable-Stainless-Apartment/dp/B082PG6HHH?_encoding=UTF8&psc=1', 'https://images-na.ssl-images-amazon.com/images/I/61EFPu4CTwL._AC_UL200_SR200,200_.jpg']
['ULIT Ice Maker Countertop, Makes 26 lbs. Ice in 24 Hours,9 Ice Cubes Ready in 8 Minutes, Countertop Ice Maker machine with Ice Scoop and Basket, fit for home party, 1.6 lbs. Ice Storage (Black)', '$109.99', 'https://www.amazon.com/ULIT-Countertop-Minutes-machine-Storage/dp/B08ML7D3F8?_encoding=UTF8&psc=1', 'https://images-na.ssl-images-amazon.com/images/I/61--36mx0nL._AC_UL200_SR200,200_.jpg']
['Euhomy Mini Fridge with Freezer, 3.2 Cu.Ft Compact Refrigerator with freezer, 2 Door Mini Fridge with freezer, Upright for Dorm, Bedroom, Office, Apartment- Food Storage or Drink Beer, Black', '$189.99', 'https://www.amazon.com/Freezer-Compact-Refrigerator-freezer-Apartment/dp/B087M8CKZ1?_encoding=UTF8&psc=1', 'https://images-na.ssl-images-amazon.com/images/I/61bvmumoqmL._AC_UL200_SR200,200_.jpg']
['Whynter CUF-110B Energy Star 1.1 Cubic Feet Upright Lock, Black Freezer', '$165.50', 'https://www.amazon.com/Whynter-CUF-110B-Energy-Upright-Freezer/dp/B00M7GMEYK?_encoding=UTF8&psc=1', 'https://images-na.ssl-images-amazon.com/images/I/7135JGlTxEL._AC_UL200_SR200,200_.jpg']
['Fluidmaster 12IM60 Ice Maker Connector, Braided Stainless Steel - 1/4 Compression Thread x 1/4 Compression Thread, 5 Ft. (60-Inch) Length', '$8.78', 'https://www.amazon.com/Fluidmaster-12IM60-Connector-Braided-Stainless/dp/B000I18U8A?_encoding=UTF8&psc=1', 'https://images-na.ssl-images-amazon.com/images/I/61C0PRQ6yqL._AC_UL200_SR200,200_.jpg']
['Kismile Counter top Ice Maker Machine with Self-cleaning, 26LBS/24H Compact Automatic Ice Maker,9 Cubes Ready in 6-8 Minutes,Portable Ice Cube Maker, Perfect for Home/Kitchen/Office/Bar (Black)', '$114.99', 'https://www.amazon.com/Kismile-Counter-Self-cleaning-Automatic-Portable/dp/B08BJ7HNKD?_encoding=UTF8&psc=1', 'https://images-na.ssl-images-amazon.com/images/I/51NUQxCErHL._AC_UL200_SR200,200_.jpg']
['Giantex Portable Mini Compact Twin Tub Washing Machine 17.6lbs Washer Spain Spinner Portable Washing Machine, Blue+ White', '$129.99', 'https://www.amazon.com/Giantex-Portable-Compact-Washing-Machine/dp/B01ALBMIEI?_encoding=UTF8&psc=1', 'https://images-na.ssl-images-amazon.com/images/I/61ydHx1s19L._AC_UL200_SR200,200_.jpg']
['Broan-NuTone 413023 Ductless Range Hood Insert with Light, Exhaust Fan for Under Cabinet, 30-Inch, Black', '$39.00', 'https://www.amazon.com/Broan-413023-Capable-Non-Ducted-Under-Cabinet/dp/B000UW20OM?_encoding=UTF8&psc=1', 'https://images-na.ssl-images-amazon.com/images/I/61d7eeOSa5L._AC_UL200_SR200,200_.jpg']
['KUPPET Washing Machine, 16.5lbs Compact Twin Tub Wash&Spin Combo for Apartment, Dorms, RVs, Camping and More, White&Brown', '$105.99', 'https://www.amazon.com/Portable-Washing-Machine-KUPPET-Apartment/dp/B081DG1BRW?_encoding=UTF8&psc=1', 'https://images-na.ssl-images-amazon.com/images/I/51u%2BU9PyqEL._AC_UL200_SR200,200_.jpg']
['Cosmo 5MU30 30 in. Under Cabinet Range Hood with Ducted / Ductless Convertible Duct, Slim Kitchen Stove Vent with, 3 Speed Exhaust Fan, Reusable Filter and LED Lights in Stainless Steel', '$99.95', 'https://www.amazon.com/Cosmo-Under-Cabinet-Range-Stainless/dp/B074PBLJVY?_encoding=UTF8&psc=1', 'https://images-na.ssl-images-amazon.com/images/I/61w1j0pDcSL._AC_UL200_SR200,200_.jpg']
['KITCHEN BASICS 101: WR21X10208 White Refrigerator Freezer Basket Replacement for GE and Haier RF-0300-29', '$17.91', 'https://www.amazon.com/Kitchen-Basics-101-Refrigerator-Replacement/dp/B07YZWCBBL?_encoding=UTF8&psc=1', 'https://images-na.ssl-images-amazon.com/images/I/513Kc1NE8OL._AC_UL200_SR200,200_.jpg']
['Broan-NuTone F403008 Two-Speed Four-Way Convertible Range Hood, 30-Inch, Almond', '$45.00', 'https://www.amazon.com/Broan-NuTone-F403008-Two-Speed-Four-Way-Convertible/dp/B0009XB4I0?_encoding=UTF8&psc=1', 'https://images-na.ssl-images-amazon.com/images/I/61tJkW1xH%2BL._AC_UL200_SR200,200_.jpg']
['Portable Washing Machine, TACKLIFE 17.6 lbs Mini Compact Twin Tub Washing Machine, Wash (11lbs) and Spin Combo(6.6 lbs), Timer Control with Soaking Function, For Apartment, Dorm, RV, Camping - DSBP171', '$107.97', 'https://www.amazon.com/Portable-Washing-TACKLIFE-Function-Apartment/dp/B088T5WTVT?_encoding=UTF8&psc=1', 'https://images-na.ssl-images-amazon.com/images/I/61ZTdEFzNmL._AC_UL200_SR200,200_.jpg']
['Dreamiracle Ice Maker Machine for Countertop, 33 lbs Bullet Ice Cube in 24H, 9 Ice Cubes Ready in 7-10 Minutes, 2.8L Ice Maker Machine with Ice Scoop and Basket', '$176.99', 'https://www.amazon.com/Dreamiracle-Machine-Countertop-Bullet-Minutes/dp/B08FDG42VZ?_encoding=UTF8&psc=1', 'https://images-na.ssl-images-amazon.com/images/I/71C9KLjYi3L._AC_UL200_SR200,200_.jpg']
['Farberware FDW05ASBWHA Complete Portable Countertop Dishwasher with 5-Liter Built-in Water Tank, 5 Programs, Baby Care, Glass & Fruit Wash-Black/White', '$349.99', 'https://www.amazon.com/Farberware-FDW05ASBWHA-Countertop-Dishwasher-Wash-White/dp/B07VR22832?_encoding=UTF8&psc=1', 'https://images-na.ssl-images-amazon.com/images/I/71a9VHdSx6L._AC_UL200_SR200,200_.jpg']
['Self-Cleaning Portable Electric Countertop Ice Maker Machine With Handle, 9 Bullet Ice Cubes Ready in 7 minutes, Up to 26lbs in 24hrs WIth Ice Scoop & Basket', '$89.99', 'https://www.amazon.com/Self-Cleaning-Portable-Electric-Countertop-Machine/dp/B08NRPKPNC?_encoding=UTF8&psc=1', 'https://images-na.ssl-images-amazon.com/images/I/618Dqdxt%2B%2BL._AC_UL200_SR200,200_.jpg']
['Washing Machine Stand sanyi Multi-functional Movable Adjustable Base Mobile Roller with 4×2 Locking Rubber Swivel Wheels and 4 Strong Feet for Washing Machine, Dryer and Refrigerator (Black)', '$30.99', 'https://www.amazon.com/Washing-sanyi-Multi-functional-Adjustable-Refrigerator/dp/B08HLT3CXM?_encoding=UTF8&psc=1', 'https://images-na.ssl-images-amazon.com/images/I/61U9RxI-OUL._AC_UL200_SR200,200_.jpg']
['COMFEE’ 1.6 Cu.ft Portable Washing Machine, 11lbs Capacity Fully Automatic Compact Washer with Wheels, 6 Wash Programs Laundry Washer with Drain Pump, Ideal for Apartments, RV, Camping, Magnetic Gray', '$289.00', 'https://www.amazon.com/COMFEE-Portable-Capacity-Automatic-Apartments/dp/B089YSKJY6?_encoding=UTF8&psc=1', 'https://images-na.ssl-images-amazon.com/images/I/41hOwz0shmL._AC_UL200_SR200,200_.jpg']
['Igloo ICEB33BK Large-Capacity Automatic Portable Electric Countertop Ice Maker Machine, 33 Pounds in 24 Hours, 9 Ice Cubes Ready in 7 minutes, With Ice Scoop and Basket, Black', '$139.99', 'https://www.amazon.com/Igloo-ICEB33BK-Large-Capacity-Automatic-Countertop/dp/B08224JXXY?_encoding=UTF8&psc=1', 'https://images-na.ssl-images-amazon.com/images/I/614alA0lYOL._AC_UL200_SR200,200_.jpg']
['Frigidaire EFIC117-SSBLACK-COM EFIC117-SSBLACK 26 Lbs Portable Compact Maker, Black Stainless Steel Ice Making Machine', '$106.99', 'https://www.amazon.com/Frigidaire-Countertop-Maker-Black-Stainless/dp/B07J1YVSNQ?_encoding=UTF8&psc=1', 'https://images-na.ssl-images-amazon.com/images/I/71AbR67LOpL._AC_UL200_SR200,200_.jpg']
['Cosmo 63175S 30 in. Wall Mount Range Hood with Ductless Convertible Duct (additional filters needed, not included), Ceiling Chimney-Style Stove Vent, LEDs Light, Permanent Filter, 3 Speed Fan, in Stainless Steel', '$199.99', 'https://www.amazon.com/Cosmo-Controls-Lighting-Permanent-Filters/dp/B01DN00K0S?_encoding=UTF8&psc=1', 'https://images-na.ssl-images-amazon.com/images/I/711z82cYQjL._AC_UL200_SR200,200_.jpg']
['Portable Single Tub Washer And Dryer- The Laundry Alternative- Mini Washing Machine- Portable Clothes Washer And Dryer- Travel Washing Machine- Small Washing Machine For Small Clothes', '$79.99', 'https://www.amazon.com/Portable-Laundry-Alternative-Washing-Machine/dp/B089DMYHQW?_encoding=UTF8&psc=1', 'https://images-na.ssl-images-amazon.com/images/I/61PhpGMD%2B2L._AC_UL200_SR200,200_.jpg']
['Arctic King ARC070S0ARBB 7 cu ft Chest Freezer, Black', '$282.00', 'https://www.amazon.com/Arctic-King-ARC070S0ARBB-Chest-Freezer/dp/B084B2XB7G?_encoding=UTF8&psc=1', 'https://images-na.ssl-images-amazon.com/images/I/713b0Bu42LL._AC_UL200_SR200,200_.jpg']
['Northair Chest Freezer - 3.5 Cu Ft with 2 Removable Baskets - Reach In Freezer Chest - Quiet Compact Freezer - 7 Temperature Settings - Black', '$249.00', 'https://www.amazon.com/Northair-Free-Standing-Removable-Temperature-Power-Saving/dp/B085TDSPMB?_encoding=UTF8&psc=1', 'https://images-na.ssl-images-amazon.com/images/I/6126AyJrsaL._AC_UL200_SR200,200_.jpg']
['ICE2 F2WC9I1 Ice Maker Water Filter (2pack)', '$99.90', 'https://www.amazon.com/F2WC9I1-Maker-Water-Filter-2pack/dp/B08PZDK1B9?_encoding=UTF8&psc=1', 'https://images-na.ssl-images-amazon.com/images/I/61dGcc1k1LL._AC_UL200_SR200,200_.jpg']
['Broan-NuTone SP3004 Reversible Stainless Steel Backsplash Range Hood Wall Shield for Kitchen, 24 by 30-Inch', '$39.10', 'https://www.amazon.com/Broan-SP3004-Backsplash-30-Inch-Stainless/dp/B000AMOBA8?_encoding=UTF8&psc=1', 'https://images-na.ssl-images-amazon.com/images/I/71ny-BjstuL._AC_UL200_SR200,200_.jpg']
['Frigidaire FFEC3025UB 30 Inch Electric Smoothtop Style Cooktop with 4 Elements in Black', '$439.00', 'https://www.amazon.com/Frigidaire-FFEC3025UB-Electric-Smoothtop-Elements/dp/B07M7FQ1CP?_encoding=UTF8&psc=1', 'https://images-na.ssl-images-amazon.com/images/I/41q-Td1J0nL._AC_UL200_SR200,200_.jpg']
['Essential Values Ice Machine Cleaner 16 fl oz, Nickel Safe Descaler | Ice Maker Cleaner Compatible with ALL Major Brands - Made in USA', '$10.97', 'https://www.amazon.com/Essential-Values-Universal-Application-Whirlpool/dp/B01LZK0UG3?_encoding=UTF8&psc=1', 'https://images-na.ssl-images-amazon.com/images/I/51UnKyfVTML._AC_UL200_SR200,200_.jpg']
["COMFEE' Portable Washing Machine, 0.9 cu.ft Compact Washer With LED Display, 5 Wash Cycles, 2 Built-in Rollers, Space Saving Full-Automatic Washer, Ideal Laundry for RV, Dorm, Apartment, Ivory White", '$219.00', 'https://www.amazon.com/COMFEE-Portable-Washing-Full-Automatic-Apartment/dp/B08NX4BVM9?_encoding=UTF8&psc=1', 'https://images-na.ssl-images-amazon.com/images/I/61ckarCl0aL._AC_UL200_SR200,200_.jpg']
['Panda 110V Electric Portable Compact Laundry Clothes Dryer, 1.5 cu.ft, Stainless Steel Drum Black and White, PAN725SF', '$229.99', 'https://www.amazon.com/Panda-cu-ft-Compact-Laundry-Dryer/dp/B00EAY540S?_encoding=UTF8&psc=1', 'https://images-na.ssl-images-amazon.com/images/I/61coJeDGFvL._AC_UL200_SR200,200_.jpg']
['Frigidaire EFIC103 Ice Maker Machine Heavy Duty, Large Stainless Steel', '$129.38', 'https://www.amazon.com/Frigidaire-EFIC103-Machine-Icemaker-Stainless/dp/B004VV8GOQ?_encoding=UTF8&psc=1', 'https://images-na.ssl-images-amazon.com/images/I/710vHAh%2B2sL._AC_UL200_SR200,200_.jpg']
['Portable Washing Machine, Kuppet 10lbs Compact Mini Washer, Wash&Spin Twin Tub Durable Design to Wash All your Laundry or Swim Suit for Apartments, Dorms, RV Camping (Blue)', '$95.99', 'https://www.amazon.com/Portable-Washing-Machine-Compact-Apartments/dp/B07SM6CWSL?_encoding=UTF8&psc=1', 'https://images-na.ssl-images-amazon.com/images/I/616AlKmVojL._AC_UL200_SR200,200_.jpg']

Process finished with exit code 0

 

Full Python script

import urllib.request
import urllib.parse
from bs4 import BeautifulSoup


def get_link_list(list):
    count = 0
    rtn_list = {}
    for item in list:
        # print(item.find('a', href=True).attrs['href'])
        link = item.find('a', href=True).attrs['href']
        count = count + 1
        rtn_list[count] = link

    return rtn_list


def get_category_numbering(list):
    rtn_list = []
    count = 1
    rtn_list.append("### Amazon Best Sellers categories\n\n")
    for item in list:
        rtn_list.append(str(count) + "." + item.get_text())
        count += 1

    return rtn_list


url = 'https://www.amazon.com/bestsellers'
request = urllib.request.urlopen(url)
html = request.read()
beautifulSoup = BeautifulSoup(html, 'html.parser')

find_ul = beautifulSoup.find('ul', id='zg_browseRoot').find('ul')
li_list = find_ul.find_all('li')

# print(li_list[0])
# print(type(li_list))  # ResultSet

# Links
links_list = get_link_list(li_list)
# print(len(links_list))
#print(links_list.items())

# Number the category list
select_list = get_category_numbering(li_list)


# Split into 5 entries per line
category_department = []
cnt = 1
for item in select_list:
    category_department.append(item + '\t\t')

    if (cnt % 5) == 0:
        category_department.append('\n')

    cnt += 1

category_department.append('\n\n' + 'Select a category: ')

input_val = input(''.join(category_department))  # type str

print(f'You selected number {input_val}.')
target_link = links_list.get(int(input_val))
print(f'Target link:{target_link}')
request.close()

request = urllib.request.urlopen(target_link)
html = request.read()
beautifulSoup = BeautifulSoup(html, 'html.parser')

find_ol = beautifulSoup.find('ol', id='zg-ordered-list')
li_list = find_ol.find_all('li')

got_item_list = []

for item in li_list:
    # print(item.find('a', href=True).attrs['href'])
    link = "https://www.amazon.com" + (item.find('a', href=True).attrs['href'])
    title = item.find('a', href=True).text.replace("\n\n            ", "").replace("\n        \n", "")
    price = item.find('span', attrs={'class': 'p13n-sc-price'}).text
    img_url = item.find('div', attrs={'class': 'a-section'}).img['src']
    #print(title)
    got_item_list.append([title, price, link, img_url])
    #print(item.find("span", attrs={'class': 'aok-inline-block'}).text)
    #print(item.find("a", attrs={'class': 'a-link-normal'}).attrs['href'])
    #https: // www.amazon.com /


print("Total count : ", len(got_item_list))
print(got_item_list[0])
print('-'*50)
for item in got_item_list:
    print(item)

 

The collected data can be saved to an Excel or CSV file. You could also serialize it as JSON or store it in a database; SQLite is probably the easiest one to get started with.

To save the data as an Excel file, see the post below.
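
As a rough sketch of the CSV and SQLite options using only the standard library, the example below writes rows shaped like got_item_list ([title, price, link, img_url]). The rows, column names, and table name are placeholders invented for the example; the CSV goes to an in-memory buffer and the database to ":memory:" so that nothing is written to disk.

```python
import csv
import io
import sqlite3

# Placeholder rows in the same order as got_item_list: title, price, link, img_url
rows = [
    ["Sample ice maker", "$99.99", "https://www.amazon.com/dp/AAA", "https://example.com/a.jpg"],
    ["Sample range hood", "$39.00", "https://www.amazon.com/dp/BBB", "https://example.com/b.jpg"],
]
header = ["title", "price", "link", "img_url"]

# CSV: for a real file, use open("items.csv", "w", newline="", encoding="utf-8") instead
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(header)
writer.writerows(rows)
csv_text = buffer.getvalue()

# SQLite: ":memory:" keeps the demo free of files; pass a filename to persist the data
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (title TEXT, price TEXT, link TEXT, img_url TEXT)")
conn.executemany("INSERT INTO items VALUES (?, ?, ?, ?)", rows)
conn.commit()
saved = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
conn.close()

print(csv_text)
print("rows saved:", saved)
```

The csv module quotes fields that contain commas, which matters here because many product titles include them.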

 

[Python] How to save to Excel (xlsx, csv) files with pandas: numpy, openpyxl, to_excel(

pandas is a module widely used for data analysis. It can read xlsx and csv files into a DataFrame. Another way to obtain the data is web crawling. …

playground.naragara.com

The product list spans several pages. You can process them by changing the URL for each page, but if you would like to watch the browser work, automating it with Selenium is also a good option.
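
For the URL-changing approach, one way to generate the page URLs is to set a query parameter with urllib.parse. This is a sketch only: the parameter name "pg" is an assumption made for illustration, so check the actual pagination links in the developer tools before relying on it. with_page is a hypothetical helper.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_page(url, page, param="pg"):
    """Return `url` with the query parameter `param` set to `page`.

    Assumption: the site paginates through a query parameter; "pg" is a guess.
    """
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query))  # keep any parameters already present
    query[param] = str(page)
    return urlunsplit(parts._replace(query=urlencode(query)))


base = "https://www.amazon.com/Best-Sellers-Appliances/zgbs/appliances"
page_urls = [with_page(base, n) for n in (1, 2)]
for page_url in page_urls:
    print(page_url)
```

Each generated URL can then go through the same request and parsing steps used for the first page.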

 


 
