LightGBM
Original author(s): Guolin Ke[1] / Microsoft Research
Developer(s): Microsoft and LightGBM Contributors[2]
Initial release: 2016
Stable release
Repository: github.com/microsoft/LightGBM
Written in: C++, Python, R, C
Operating system: Windows, macOS, Linux
Type: Machine learning, Gradient boosting framework
License: MIT License
Website: lightgbm.readthedocs.io

LightGBM, short for Light Gradient Boosting Machine, is a free and open source distributed gradient boosting framework for machine learning originally developed by Microsoft.[4][5] It is based on decision tree algorithms and used for ranking, classification and other machine learning tasks. The development focus is on performance and scalability.

Overview

The LightGBM framework supports different algorithms including GBT, GBDT, GBRT, GBM, MART[6][7] and RF.[8] LightGBM has many of XGBoost's advantages, including sparse optimization, parallel training, multiple loss functions, regularization, bagging, and early stopping. A major difference between the two lies in how trees are constructed. LightGBM does not grow a tree level-wise (one full level of nodes at a time) as most other implementations do.[9] Instead, it grows trees leaf-wise: at each step it splits the leaf that it estimates will yield the largest decrease in loss.[10] In addition, LightGBM does not use the widely used sort-based decision tree learning algorithm, which searches for the best split point over sorted feature values,[11] as XGBoost and other implementations do. Instead, LightGBM implements a highly optimized histogram-based decision tree learning algorithm, which buckets feature values into discrete bins and offers advantages in both efficiency and memory consumption.[12] The LightGBM algorithm also uses two novel techniques, Gradient-Based One-Side Sampling (GOSS) and Exclusive Feature Bundling (EFB), which allow the algorithm to run faster while maintaining a high level of accuracy.[13]
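The idea behind histogram-based split finding can be illustrated with a small sketch. This is a simplified illustration, not LightGBM's implementation: it assumes the feature has already been discretized into integer bins, uses a variance-style gain for squared-error loss with unit Hessians, and the function names build_histogram and best_split are hypothetical.

```python
from typing import List, Tuple


def build_histogram(bins: List[int], grads: List[float], n_bins: int) -> Tuple[List[float], List[int]]:
    """Accumulate the gradient sum and the instance count for each bin."""
    grad_sum = [0.0] * n_bins
    count = [0] * n_bins
    for b, g in zip(bins, grads):
        grad_sum[b] += g
        count[b] += 1
    return grad_sum, count


def best_split(bins: List[int], grads: List[float], n_bins: int) -> Tuple[int, float]:
    """Scan bin boundaries (not individual sorted values) for the best split.

    Returns (bin index, gain); instances with bin <= index go to the left child.
    Returns (-1, 0.0) if no split improves on the unsplit node.
    """
    grad_sum, count = build_histogram(bins, grads, n_bins)
    total_g, total_n = sum(grad_sum), sum(count)
    best_bin, best_gain = -1, 0.0
    left_g, left_n = 0.0, 0
    for b in range(n_bins - 1):
        left_g += grad_sum[b]
        left_n += count[b]
        right_g, right_n = total_g - left_g, total_n - left_n
        if left_n == 0 or right_n == 0:
            continue  # a split must leave instances on both sides
        # Simplified gain, assuming unit Hessians and no regularization.
        gain = left_g ** 2 / left_n + right_g ** 2 / right_n - total_g ** 2 / total_n
        if gain > best_gain:
            best_bin, best_gain = b, gain
    return best_bin, best_gain


if __name__ == "__main__":
    bins = [0, 0, 1, 1, 2, 2, 3, 3]
    grads = [-1.0, -1.0, -1.0, -1.0, 1.0, 1.0, 1.0, 1.0]
    print(best_split(bins, grads, n_bins=4))
```

Once the histogram is built, the cost of the scan depends on the number of bins rather than the number of distinct feature values, which is where the efficiency advantage described above comes from.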

LightGBM works on Linux, Windows, and macOS and supports C++, Python,[14] R, and C#.[15] The source code is licensed under the MIT License and is available on GitHub.[16]

Gradient-Based One-Side Sampling

Gradient-Based One-Side Sampling (GOSS) is a method that leverages the fact that GBDT has no native weight for data instances. Because data instances with different gradients play different roles in the computation of information gain, instances with larger gradients contribute more to it. To keep the information gain estimate accurate, GOSS keeps the instances with large gradients and randomly drops a portion of the instances with small gradients. The small-gradient instances that remain are up-weighted by a constant factor to compensate for the sampling.[13]
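The sampling step can be sketched as follows. This is a minimal illustration of the procedure described in the LightGBM paper[13], not the library's own code: the function name goss_sample is hypothetical, and top_rate and other_rate are illustrative names for the two sampling ratios (called a and b in the paper).

```python
import random
from typing import Dict, List


def goss_sample(grads: List[float], top_rate: float = 0.2,
                other_rate: float = 0.1, seed: int = 0) -> Dict[int, float]:
    """Return {instance index: weight} for the instances kept by GOSS."""
    n = len(grads)
    n_top = int(top_rate * n)
    n_other = int(other_rate * n)

    # Rank instances by absolute gradient, largest first.
    order = sorted(range(n), key=lambda i: abs(grads[i]), reverse=True)
    top, rest = order[:n_top], order[n_top:]

    # Randomly sample from the small-gradient instances.
    rng = random.Random(seed)
    sampled = rng.sample(rest, min(n_other, len(rest)))

    # Up-weight the sampled small-gradient instances by (1 - a) / b so that
    # their total contribution to the gain estimate stays roughly unchanged.
    amplify = (1.0 - top_rate) / other_rate
    weights = {i: 1.0 for i in top}
    weights.update({i: amplify for i in sampled})
    return weights


if __name__ == "__main__":
    grads = [5.0, -4.0, 0.1, 0.2, -0.1, 0.3, 0.05, -0.2, 0.15, 0.0]
    print(goss_sample(grads, top_rate=0.2, other_rate=0.1))
```

With ten instances and these ratios, the two largest-gradient instances are always kept with weight 1, and one of the remaining eight is kept with a larger weight.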

Exclusive Feature Bundling

Exclusive Feature Bundling (EFB) is a near-lossless method to reduce the number of effective features. In a sparse feature space many features are nearly exclusive, meaning they rarely take nonzero values simultaneously. One-hot encoded features are a perfect example of exclusive features. EFB bundles these features, reducing dimensionality to improve efficiency while maintaining a high level of accuracy. A set of exclusive features merged into a single feature is called an exclusive feature bundle.[13]
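A rough sketch of the bundling idea is shown below. It is a simplification under stated assumptions, not LightGBM's algorithm: features are visited in their given order rather than ordered by conflict count, feature values are assumed to be small non-negative integer bins with 0 meaning "absent", and the function names bundle_exclusive and merge_bundle are hypothetical.

```python
from typing import List


def bundle_exclusive(columns: List[List[int]], max_conflicts: int = 0) -> List[List[int]]:
    """Greedily group feature columns that are rarely nonzero on the same row."""
    bundles = []  # each bundle tracks its feature indices and its nonzero rows
    for j, col in enumerate(columns):
        rows = {i for i, v in enumerate(col) if v != 0}
        for b in bundles:
            # Join the first bundle whose nonzero rows overlap little enough.
            if len(b["nonzero"] & rows) <= max_conflicts:
                b["features"].append(j)
                b["nonzero"] |= rows
                break
        else:
            bundles.append({"features": [j], "nonzero": set(rows)})
    return [b["features"] for b in bundles]


def merge_bundle(columns: List[List[int]], features: List[int], n_bins: List[int]) -> List[int]:
    """Merge bundled features into one column by shifting each feature's bins.

    n_bins[j] is the number of nonzero bin values used by feature j, so each
    feature in the bundle occupies its own range of values in the merged column.
    """
    merged = [0] * len(columns[0])
    offset = 0
    for j in features:
        for i, v in enumerate(columns[j]):
            if v != 0:
                merged[i] = v + offset
        offset += n_bins[j]
    return merged


if __name__ == "__main__":
    # Columns 0-2 are mutually exclusive (one-hot style); column 3 overlaps them.
    cols = [[1, 0, 0, 1], [0, 1, 0, 0], [0, 0, 1, 0], [1, 1, 0, 0]]
    bundles = bundle_exclusive(cols)
    print(bundles)
    print(merge_bundle(cols, bundles[0], n_bins=[1, 1, 1, 1]))
```

Because each original feature maps to a distinct value range in the merged column, a split on the merged feature can still separate the original features from one another.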

References

  1. "Guolin Ke".
  2. "microsoft/LightGBM". GitHub.
  3. "Releases · microsoft/LightGBM". GitHub.
  4. Brownlee, Jason (March 31, 2020). "Gradient Boosting with Scikit-Learn, XGBoost, LightGBM, and CatBoost".
  5. Kopitar, Leon; Kocbek, Primoz; Cilar, Leona; Sheikh, Aziz; Stiglic, Gregor (July 20, 2020). "Early detection of type 2 diabetes mellitus using machine learning-based prediction models". Scientific Reports. 10 (1): 11981. Bibcode: 2020NatSR..1011981K. doi: 10.1038/s41598-020-68771-z. PMC 7371679. PMID 32686721 – via www.nature.com.
  6. "Understanding LightGBM Parameters (and How to Tune Them)". neptune.ai. May 6, 2020.
  7. "An Overview of LightGBM". avanwyk. May 16, 2018.
  8. "Parameters — LightGBM 3.0.0.99 documentation". lightgbm.readthedocs.io.
  9. "The Gradient Boosters IV: LightGBM". Deep & Shallow.
  10. Ye, Andre (September 2020). "XGBoost, LightGBM, and Other Kaggle Competition Favorites". Towards Data Science.
  11. Mehta, Manish; Agrawal, Rakesh; Rissanen, Jorma (Nov 24, 2020). "SLIQ: A fast scalable classifier for data mining". International Conference on Extending Database Technology. CiteSeerX 10.1.1.89.7734.
  12. "Features — LightGBM 3.1.0.99 documentation". lightgbm.readthedocs.io.
  13. Ke, Guolin; Meng, Qi; Finley, Thomas; Wang, Taifeng; Chen, Wei; Ma, Weidong; Ye, Qiwei; Liu, Tie-Yan (2017). "LightGBM: A Highly Efficient Gradient Boosting Decision Tree". Advances in Neural Information Processing Systems. 30.
  14. "lightgbm: LightGBM Python Package" – via PyPI.
  15. "Microsoft.ML.Trainers.LightGbm Namespace". docs.microsoft.com.
  16. "microsoft/LightGBM". October 6, 2020 – via GitHub.

Further reading

  • Guolin Ke, Qi Meng, Thomas Finley, Taifeng Wang, Wei Chen, Weidong Ma, Qiwei Ye, Tie-Yan Liu (2017). "LightGBM: A Highly Efficient Gradient Boosting Decision Tree" (PDF).
  • Quinto, Butch (2020). Next-Generation Machine Learning with Spark – Covers XGBoost, LightGBM, Spark NLP, Distributed Deep Learning with Keras, and More. Apress. ISBN 978-1-4842-5668-8.

This is a guide for building the LightGBM Command Line Interface (CLI). If you want to build the Python-package or R-package, please refer to the Python-package and R-package folders respectively.

All instructions below are aimed at compiling the 64-bit version of LightGBM. It is worth compiling the 32-bit version only in very rare special cases involving environmental limitations. The 32-bit version is slow and untested, so use it at your own risk and don't forget to adjust some of the commands below when installing.

If you need to build a static library instead of a shared one, you can add -DBUILD_STATIC_LIB=ON to CMake flags.

Users who want to perform benchmarking can make LightGBM output time costs for different internal routines by adding -DUSE_TIMETAG=ON to CMake flags.

It is possible to build LightGBM in debug mode. In this mode all compiler optimizations are disabled and LightGBM performs more checks internally. To enable debug mode you can add -DUSE_DEBUG=ON to CMake flags or choose Debug_* configuration (e.g. Debug_DLL, Debug_mpi) in Visual Studio depending on how you are building LightGBM.

You can also download the artifacts of the latest successful build on the master branch (nightly builds).

Windows¶

On Windows LightGBM can be built using

  • Visual Studio;

  • CMake and VS Build Tools;

  • CMake and MinGW.

Visual Studio (or VS Build Tools)¶

With GUI¶

  1. Install Visual Studio (2015 or newer).

  2. Download zip archive and unzip it.

  3. Go to LightGBM-master/windows folder.

  4. Open LightGBM.sln file with Visual Studio, choose Release configuration and click BUILD -> Build Solution (Ctrl+Shift+B).

    If you have errors about Platform Toolset, go to PROJECT -> Properties -> Configuration Properties -> General and select the toolset installed on your machine.

The .exe file will be in LightGBM-master/windows/x64/Release folder.

From Command Line¶

  1. Install Git for Windows, CMake (3.8 or higher) and VS Build Tools (VS Build Tools is not needed if Visual Studio (2015 or newer) is already installed).

  2. Run the following commands:

The .exe and .dll files will be in LightGBM/Release folder.

MinGW-w64¶

  1. Install Git for Windows, CMake and MinGW-w64.

  2. Run the following commands:

The .exe and .dll files will be in LightGBM/ folder.

Note: You may need to run the cmake -G "MinGW Makefiles" .. command one more time if you encounter the "sh.exe was found in your PATH" error.

It is recommended that you use Visual Studio since it has better multithreading efficiency in Windows for many-core systems (see Question 4 and Question 8).

Also, you may want to read gcc Tips.

Linux¶

On Linux LightGBM can be built using CMake and gcc or Clang.

  1. Install CMake.

  2. Run the following commands:

Note: glibc >= 2.14 is required.

Also, you may want to read gcc Tips.

macOS¶

On macOS LightGBM can be installed using Homebrew, or can be built using CMake and Apple Clang or gcc.

Apple Clang¶

Only Apple Clang version 8.1 or higher is supported.

Install Using Homebrew

Build from GitHub¶

  1. Install CMake (3.16 or higher):

  2. Install OpenMP:

  3. Run the following commands:

gcc¶

  1. Install CMake (3.2 or higher):

  2. Install gcc:

  3. Run the following commands:

Also, you may want to read gcc Tips.

Docker¶

Refer to Docker folder.

Build Threadless Version (not Recommended)¶

The default build version of LightGBM is based on OpenMP. You can build LightGBM without OpenMP support, but this is strongly discouraged.

Windows¶

On Windows a version of LightGBM without OpenMP support can be built using

  • Visual Studio;

  • CMake and VS Build Tools;

  • CMake and MinGW.

Visual Studio (or VS Build Tools)¶

With GUI¶
  1. Install Visual Studio (2015 or newer).

  2. Download zip archive and unzip it.

  3. Go to LightGBM-master/windows folder.

  4. Open LightGBM.sln file with Visual Studio.

  5. Go to PROJECT -> Properties -> Configuration Properties -> C/C++ -> Language and change the OpenMP Support property to No (/openmp-).

  6. Get back to the project's main screen, then choose Release configuration and click BUILD -> Build Solution (Ctrl+Shift+B).

    If you have errors about Platform Toolset, go to PROJECT -> Properties -> Configuration Properties -> General and select the toolset installed on your machine.

The .exe file will be in LightGBM-master/windows/x64/Release folder.

From Command Line¶
  1. Install Git for Windows, CMake (3.8 or higher) and VS Build Tools (VS Build Tools is not needed if Visual Studio (2015 or newer) is already installed).

  2. Run the following commands:

The .exe and .dll files will be in LightGBM/Release folder.

MinGW-w64¶

  1. Install Git for Windows, CMake and MinGW-w64.

  2. Run the following commands:

The .exe and .dll files will be in LightGBM/ folder.

Note: You may need to run the cmake -G "MinGW Makefiles" -DUSE_OPENMP=OFF .. command one more time if you encounter the "sh.exe was found in your PATH" error.

Linux¶

On Linux a version of LightGBM without OpenMP support can be built using CMake and gcc or Clang.

  1. Install CMake.

  2. Run the following commands:

Note: glibc >= 2.14 is required.

macOS¶

On macOS a version of LightGBM without OpenMP support can be built using CMake and Apple Clang or gcc.

Apple Clang¶

Only Apple Clang version 8.1 or higher is supported.

  1. Install CMake (3.16 or higher):

  2. Run the following commands:

gcc¶

  1. Install CMake (3.2 or higher):

  2. Install gcc:

  3. Run the following commands:

Build MPI Version¶

The default build version of LightGBM uses socket-based communication. LightGBM also supports MPI, a high-performance communication approach with RDMA support.

If you need to run a distributed learning application with high-performance communication, you can build LightGBM with MPI support.

Windows¶

On Windows an MPI version of LightGBM can be built using

  • MS MPI and Visual Studio;

  • MS MPI, CMake and VS Build Tools.

With GUI¶

  1. You need to install MS MPI first. Both msmpisdk.msi and msmpisetup.exe are needed.

  2. Install Visual Studio (2015 or newer).

  3. Download zip archive and unzip it.

  4. Go to LightGBM-master/windows folder.

  5. Open LightGBM.sln file with Visual Studio, choose Release_mpi configuration and click BUILD -> Build Solution (Ctrl+Shift+B).

    If you have errors about Platform Toolset, go to PROJECT -> Properties -> Configuration Properties -> General and select the toolset installed on your machine.

The .exe file will be in LightGBM-master/windows/x64/Release_mpi folder.

From Command Line¶

  1. You need to install MS MPI first. Both msmpisdk.msi and msmpisetup.exe are needed.

  2. Install Git for Windows, CMake (3.8 or higher) and VS Build Tools (VS Build Tools is not needed if Visual Studio (2015 or newer) is already installed).

  3. Run the following commands:

The .exe and .dll files will be in LightGBM/Release folder.

Note: Building the MPI version with MinGW is not supported because MinGW lacks an MPI library.

Linux¶

On Linux an MPI version of LightGBM can be built using Open MPI, CMake and gcc or Clang.

  1. Install Open MPI.

  2. Install CMake.

  3. Run the following commands:

Note: glibc >= 2.14 is required.

macOS¶

On macOS an MPI version of LightGBM can be built using Open MPI, CMake and Apple Clang or gcc.

Apple Clang¶

Only Apple Clang version 8.1 or higher is supported.

  1. Install CMake (3.16 or higher):

  2. Install OpenMP:

  3. Install Open MPI:

  4. Run the following commands:

gcc¶

  1. Install CMake (3.2 or higher):

  2. Install gcc:

  3. Install Open MPI:

  4. Run the following commands:

Build GPU Version¶

Linux¶

On Linux a GPU version of LightGBM (device_type=gpu) can be built using OpenCL, Boost, CMake and gcc or Clang.

The following dependencies should be installed before compilation:

  • OpenCL 1.2 headers and libraries, which are usually provided by the GPU manufacturer.

    The generic OpenCL ICD packages (for example, Debian package ocl-icd-libopencl1 and ocl-icd-opencl-dev) can also be used.

  • libboost 1.56 or later (1.61 or later is recommended).

    We use Boost.Compute as the interface to GPU, which is part of the Boost library since version 1.61. However, since we include the source code of Boost.Compute as a submodule, we only require the host has Boost 1.56 or later installed. We also use Boost.Align for memory allocation. Boost.Compute requires Boost.System and Boost.Filesystem to store offline kernel cache.

    The following Debian packages should provide necessary Boost libraries: libboost-dev, libboost-system-dev, libboost-filesystem-dev.

  • CMake 3.2 or later.

To build LightGBM GPU version, run the following commands:

Windows¶

On Windows a GPU version of LightGBM (device_type=gpu) can be built using OpenCL, Boost, CMake and VS Build Tools or MinGW.

If you use MinGW, the build procedure is similar to the build on Linux. Refer to GPU Windows Compilation to get more details.

The following procedure is for the MSVC (Microsoft Visual C++) build.

  1. Install Git for Windows, CMake (3.8 or higher) and VS Build Tools (VS Build Tools is not needed if Visual Studio (2015 or newer) is installed).

  2. Install OpenCL for Windows. The installation depends on the brand (NVIDIA, AMD, Intel) of your GPU card.

    • For running on Intel, get Intel SDK for OpenCL.

    • For running on AMD, get AMD APP SDK.

    • For running on NVIDIA, get CUDA Toolkit.

    Further reading and correspondence table: GPU SDK Correspondence and Device Targeting Table.

  3. Install Boost Binaries.

    Note: Match your Visual C++ version:

    Visual Studio 2015 -> msvc-14.0-64.exe,

    Visual Studio 2017 -> msvc-14.1-64.exe,

    Visual Studio 2019 -> msvc-14.2-64.exe.

  4. Run the following commands:

    Note: C:/local/boost_1_63_0 and C:/local/boost_1_63_0/lib64-msvc-14.0 are locations of your Boost binaries (assuming you’ve downloaded 1.63.0 version for Visual Studio 2015).

Docker¶

Refer to GPU Docker folder.

Build CUDA Version (Experimental)¶

The original GPU build of LightGBM (device_type=gpu) is based on OpenCL.

The CUDA-based build (device_type=cuda) is a separate implementation and requires an NVIDIA graphics card with compute capability 6.0 or higher. It should be considered experimental, and we suggest using it only when it is impossible to use the OpenCL version (for example, on IBM POWER microprocessors).

Note: only Linux is supported; other operating systems are not supported yet.
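Once a GPU or CUDA build is installed, the backend is chosen at training time through the device_type parameter mentioned above. The sketch below only assembles a parameter fragment. The helper name device_params is hypothetical, and the fragment is assumed to be merged into the parameter dictionary you pass to LightGBM (for example, to lightgbm.train in the Python package).

```python
def device_params(backend: str) -> dict:
    """Return a LightGBM parameter fragment selecting the compute backend.

    "gpu" selects the OpenCL-based build and "cuda" the CUDA-based build,
    as described in this guide; "cpu" is the default backend.
    """
    allowed = {"cpu", "gpu", "cuda"}
    if backend not in allowed:
        raise ValueError(f"unknown backend {backend!r}; expected one of {sorted(allowed)}")
    return {"device_type": backend}


if __name__ == "__main__":
    # Illustrative training parameters; "objective" is just an example value.
    params = {"objective": "regression", **device_params("cuda")}
    print(params)
```

Selecting a backend that the installed library was not built with is expected to fail at training time, so the value should match the build you compiled.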

Linux¶

On Linux a CUDA version of LightGBM can be built using CUDA, CMake and gcc or Clang.

The following dependencies should be installed before compilation:

  • CUDA 9.0 or later libraries. Please refer to this detailed guide. Pay great attention to the minimum required versions of host compilers listed in the table from that guide and use only recommended versions of compilers.

  • CMake 3.16 or later.

To build LightGBM CUDA version, run the following commands:

Build HDFS Version¶

The HDFS version of LightGBM was tested on CDH-5.14.4 cluster.

Linux¶

On Linux an HDFS version of LightGBM can be built using CMake and gcc.

  1. Install CMake.

  2. Run the following commands:

Build Java Wrapper¶

Using the following instructions you can generate a JAR file containing the LightGBM C API wrapped by SWIG.

Windows¶

On Windows a Java wrapper of LightGBM can be built using Java, SWIG, CMake and VS Build Tools or MinGW.

VS Build Tools¶

  1. Install Git for Windows, CMake (3.8 or higher) and VS Build Tools (VS Build Tools is not needed if Visual Studio (2015 or newer) is already installed).

  2. Install SWIG and Java (also make sure that JAVA_HOME is set properly).

  3. Run the following commands:

The .jar file will be in LightGBM/build folder and the .dll files will be in LightGBM/Release folder.

MinGW-w64¶

  1. Install Git for Windows, CMake and MinGW-w64.

  2. Install SWIG and Java (also make sure that JAVA_HOME is set properly).

  3. Run the following commands:

The .jar file will be in LightGBM/build folder and the .dll files will be in LightGBM/ folder.

Note: You may need to run the cmake -G "MinGW Makefiles" -DUSE_SWIG=ON .. command one more time if you encounter the "sh.exe was found in your PATH" error.

It is recommended to use VS Build Tools (Visual Studio) since it has better multithreading efficiency in Windows for many-core systems (see Question 4 and Question 8).

Also, you may want to read gcc Tips.

Linux¶

On Linux a Java wrapper of LightGBM can be built using Java, SWIG, CMake and gcc or Clang.

  1. Install CMake, SWIG and Java (also make sure that JAVA_HOME is set properly).

  2. Run the following commands:

macOS¶

On macOS a Java wrapper of LightGBM can be built using Java, SWIG, CMake and Apple Clang or gcc.

First, install SWIG and Java (also make sure that JAVA_HOME is set properly). Then, follow either the Apple Clang or the gcc installation instructions below.

Apple Clang¶

Only Apple Clang version 8.1 or higher is supported.

  1. Install CMake (3.16 or higher):

  2. Install OpenMP:

  3. Run the following commands:

gcc¶

  1. Install CMake (3.2 or higher):

  2. Install gcc:

  3. Run the following commands:

Also, you may want to read gcc Tips.

Build C++ Unit Tests¶

Windows¶

On Windows, C++ unit tests of LightGBM can be built using CMake and VS Build Tools.

  1. Install Git for Windows, CMake (3.8 or higher) and VS Build Tools (VS Build Tools is not needed if Visual Studio (2015 or newer) is already installed).

  2. Run the following commands:

The .exe file will be in LightGBM/Debug folder.

Linux¶

On Linux, C++ unit tests of LightGBM can be built using CMake and gcc or Clang.

  1. Install CMake.

  2. Run the following commands:

Note: glibc >= 2.14 is required.

macOS¶

On macOS, C++ unit tests of LightGBM can be built using CMake and Apple Clang or gcc.

Apple Clang¶

Only Apple Clang version 8.1 or higher is supported.

  1. Install CMake (3.16 or higher):

  2. Run the following commands:

gcc¶

  1. Install CMake (3.2 or higher):

  2. Install gcc:

  3. Run the following commands: