Scrapy installation on CentOS and Windows

Scrapy is a Python framework for crawling websites and extracting structured data from them. This post walks through installing Scrapy and its dependencies on both CentOS and Windows.

Scrapy Installation on CentOS 6.5

Scrapy needs Python 2.7 or later to run on CentOS. CentOS 6.5 ships with Python 2.6, so we need to install Python 2.7+ alongside it before we can run Scrapy code. The steps below install Python 2.7.11. First, install the build dependencies for Python and Scrapy:

yum -y update  
yum groupinstall -y 'development tools'
yum install -y zlib-devel bzip2-devel openssl-devel xz-libs wget python-devel libxml2-devel libxslt-devel pyOpenSSL gcc libffi libffi-devel sqlite-devel sqlite

Installing Python 2.7.11

wget https://www.python.org/ftp/python/2.7.11/Python-2.7.11.tgz
tar -xvzf Python-2.7.11.tgz

# Enter the directory:
cd Python-2.7.11

# Run the configure:
./configure --prefix=/usr/local

# Compile and install it (altinstall leaves the system Python 2.6, which yum depends on, untouched):
make  
make altinstall

# Checking Python version:
[root@nicetry ~]# python2.7 -V
Python 2.7.11
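
The same check can be made from inside the interpreter, which is handy when several Python versions are installed side by side. Below is a minimal sketch; the helper name meets_minimum is hypothetical and not part of Python or Scrapy, and the (2, 7) minimum is taken from the requirement stated above.

```python
import sys


def meets_minimum(version_info, minimum=(2, 7)):
    """Return True if (major, minor) of version_info is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


# Report on the interpreter that is running this script.
print("Running Python %d.%d.%d" % tuple(sys.version_info[:3]))
if meets_minimum(sys.version_info):
    print("Version is new enough for Scrapy")
else:
    print("Version is too old; this guide needs 2.7 or later")
```

Run it with the interpreter you want to test, for example python2.7 followed by the name of the file you saved it under.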

# Environment variables
Add the line below to your ~/.bashrc file, then run source ~/.bashrc:
export PATH="/usr/local/bin:$PATH" 
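
Because the shell runs the first match it finds on PATH, /usr/local/bin (where the --prefix=/usr/local build puts its executables) needs to come before /usr/bin. The sketch below checks that ordering; the function name comes_first is hypothetical, and treating /usr/bin as the competing directory is an assumption about a typical CentOS layout.

```python
import os


def comes_first(preferred, other, path_value, sep=os.pathsep):
    """Return True if `preferred` is on the path and listed before `other`."""
    # Drop empty entries and trailing slashes so "/usr/bin/" matches "/usr/bin".
    entries = [entry.rstrip("/") or "/" for entry in path_value.split(sep) if entry]
    if preferred not in entries:
        return False
    if other not in entries:
        return True
    return entries.index(preferred) < entries.index(other)


current = os.environ.get("PATH", "")
print("/usr/local/bin wins over /usr/bin: %s"
      % comes_first("/usr/local/bin", "/usr/bin", current))
```

If this prints False after you have edited .bashrc, open a new shell or run source on the file again.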

Install setuptools

wget --no-check-certificate https://pypi.python.org/packages/source/s/setuptools/setuptools-1.4.2.tar.gz

# Extract the files:
tar -xvf setuptools-1.4.2.tar.gz  
cd setuptools-1.4.2

# Install setuptools using Python 2.7.11:
python2.7 setup.py install

Install pip (a package manager used to install and manage Python packages)

curl https://bootstrap.pypa.io/get-pip.py | python2.7 -

pip -V
pip 8.1.1 from /usr/local/lib/python2.7/site-packages (python 2.7)

Install Scrapy

CFLAGS="-O0" pip install lxml
pip install scrapy

Once Scrapy is installed, verify it by running “scrapy shell <URL>” from the command line.
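
Before reaching for the Scrapy shell, it can help to confirm that the packages import cleanly under the interpreter you just built. This is a rough sketch, not an official Scrapy check: check_modules is a hypothetical helper, and a package that does not expose a __version__ attribute is reported as "unknown".

```python
import importlib


def check_modules(names):
    """Map each module name to its version string, or None if the import fails."""
    results = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            results[name] = None
        else:
            results[name] = getattr(module, "__version__", "unknown")
    return results


for name, version in sorted(check_modules(["lxml", "scrapy"]).items()):
    print("%-8s %s" % (name, version or "NOT INSTALLED"))
```

Run it with python2.7 so that it tests the same interpreter pip installed the packages into.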

Scrapy Installation on Windows

Python installation

Install Python 2.7.11 from https://www.python.org/downloads/

Update the PATH: open a Command Prompt and run:

c:\python27\python.exe c:\python27\tools\scripts\win_add2path.py

Close the Command Prompt window so that the change takes effect. Reopen it and run the following command to check the Python version:

python --version

pywin32 installation
Install pywin32 from http://sourceforge.net/projects/pywin32/

Be sure to download the build (win32 or amd64) that matches the architecture of your Python installation, which is not necessarily that of your operating system. In our case the system was 64-bit, but we still needed the 32-bit build (example: pywin32-219.win32-py2.7.exe).
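
If you are unsure which build your Python is, you can ask the interpreter for its pointer size. The sketch below uses the standard struct module; the function names are hypothetical, and the win32/amd64 labels simply mirror the two choices mentioned above.

```python
import struct


def interpreter_bits():
    """Pointer size of the running interpreter, in bits (32 or 64)."""
    return struct.calcsize("P") * 8


def matching_build(bits):
    """Pick the installer label (as named in this post) for a given bitness."""
    return "win32" if bits == 32 else "amd64"


bits = interpreter_bits()
print("This Python is %d-bit; download the %s installers"
      % (bits, matching_build(bits)))
```

The same answer applies to every binary installer in this section, since each one has to match the interpreter it is loaded into.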

pip installation
Install pip by following https://pip.pypa.io/en/latest/installing.html
Then open a Command Prompt to check that pip is installed correctly:

pip --version

lxml package installation
Install the lxml package from https://pypi.python.org/packages/2.7/l/lxml/lxml-3.5.0.win32-py2.7.exe#md5=3fb7a9fb71b7d0f53881291614bd323c

Install Visual C++ Compiler
Install “Microsoft Visual C++ Compiler for Python 2.7” from https://www.microsoft.com/en-in/download/details.aspx?id=44266

Scrapy installation
Make sure Python 2.7 and pip are working, then install Scrapy and the additional packages listed below:
pip install Scrapy
pip install happybase
pip install requests
pip install OrderedDict
pip install html2text

Add PyDev perspective

Install Eclipse Juno or STS, then add the PyDev perspective to write Python code.

Once Scrapy is installed, verify it by running “scrapy shell <URL>” from the command line.
