| Name | Date | Size | #Lines | LOC |
|------|------|------|--------|-----|
| .. | 03-May-2022 | - | | |
| dev/ | 22-Nov-2021 | - | 218 | 131 |
| xgboost4j/ | 22-Nov-2021 | - | 8,754 | 5,101 |
| xgboost4j-example/ | 22-Nov-2021 | - | 1,964 | 1,059 |
| xgboost4j-flink/ | 22-Nov-2021 | - | 227 | 144 |
| xgboost4j-gpu/ | 22-Nov-2021 | - | 1,136 | 851 |
| xgboost4j-spark/ | 22-Nov-2021 | - | 7,132 | 5,096 |
| xgboost4j-spark-gpu/ | 22-Nov-2021 | - | 49 | 48 |
| xgboost4j-tester/ | 22-Nov-2021 | - | 274 | 249 |
| .gitignore | 22-Nov-2021 | 20 | 3 | 2 |
| README.md | 22-Nov-2021 | 4.6 KiB | 137 | 100 |
| checkstyle-suppressions.xml | 22-Nov-2021 | 1.2 KiB | 33 | 6 |
| checkstyle.xml | 22-Nov-2021 | 6.9 KiB | 163 | 111 |
| create_jni.py | 22-Nov-2021 | 5.2 KiB | 167 | 129 |
| pom.xml | 22-Nov-2021 | 21.7 KiB | 548 | 536 |
| scalastyle-config.xml | 22-Nov-2021 | 13.3 KiB | 277 | 158 |

README.md

# XGBoost4J: Distributed XGBoost for Scala/Java
[![Build Status](https://travis-ci.org/dmlc/xgboost.svg?branch=master)](https://travis-ci.org/dmlc/xgboost)
[![Documentation Status](https://readthedocs.org/projects/xgboost/badge/?version=latest)](https://xgboost.readthedocs.org/en/latest/jvm/index.html)
[![GitHub license](http://dmlc.github.io/img/apache2.svg)](../LICENSE)

[Documentation](https://xgboost.readthedocs.org/en/latest/jvm/index.html) |
[Resources](../demo/README.md) |
[Release Notes](../NEWS.md)

XGBoost4J is the JVM package of XGBoost. It brings all the optimizations
and power of XGBoost to the JVM ecosystem.

- Train XGBoost models in Scala and Java with easy customizations.
- Run distributed XGBoost natively on JVM frameworks such as
Apache Flink and Apache Spark.

You can find more about XGBoost in the [Documentation](https://xgboost.readthedocs.org/en/latest/jvm/index.html) and on the [Resource Page](../demo/README.md).
18
## Add Maven Dependency

The XGBoost4J, XGBoost4J-Spark, etc. artifacts in the Maven repository are compiled with g++-4.8.5.

### Access release version

<b>Maven</b>

```xml
<dependency>
    <groupId>ml.dmlc</groupId>
    <artifactId>xgboost4j_2.12</artifactId>
    <version>latest_version_num</version>
</dependency>
<dependency>
    <groupId>ml.dmlc</groupId>
    <artifactId>xgboost4j-spark_2.12</artifactId>
    <version>latest_version_num</version>
</dependency>
```

<b>sbt</b>
```sbt
libraryDependencies ++= Seq(
  "ml.dmlc" %% "xgboost4j" % "latest_version_num",
  "ml.dmlc" %% "xgboost4j-spark" % "latest_version_num"
)
```

For the latest release version number, please check [here](https://github.com/dmlc/xgboost/releases).

To enable the GPU algorithm (`tree_method='gpu_hist'`), use artifacts `xgboost4j-gpu_2.12` and `xgboost4j-spark-gpu_2.12` instead.
51
### Access SNAPSHOT version

First add the following Maven repository hosted by the XGBoost project:

<b>Maven</b>:

```xml
<repository>
  <id>XGBoost4J Snapshot Repo</id>
  <name>XGBoost4J Snapshot Repo</name>
  <url>https://s3-us-west-2.amazonaws.com/xgboost-maven-repo/snapshot/</url>
</repository>
```

<b>sbt</b>:

```sbt
resolvers += "XGBoost4J Snapshot Repo" at "https://s3-us-west-2.amazonaws.com/xgboost-maven-repo/snapshot/"
```

Then add XGBoost4J as a dependency:

<b>Maven</b>

```xml
<dependency>
    <groupId>ml.dmlc</groupId>
    <artifactId>xgboost4j_2.12</artifactId>
    <version>latest_version_num-SNAPSHOT</version>
</dependency>
<dependency>
    <groupId>ml.dmlc</groupId>
    <artifactId>xgboost4j-spark_2.12</artifactId>
    <version>latest_version_num-SNAPSHOT</version>
</dependency>
```

<b>sbt</b>
```sbt
libraryDependencies ++= Seq(
  "ml.dmlc" %% "xgboost4j" % "latest_version_num-SNAPSHOT",
  "ml.dmlc" %% "xgboost4j-spark" % "latest_version_num-SNAPSHOT"
)
```

For the latest SNAPSHOT version number, please check [the repository listing](https://s3-us-west-2.amazonaws.com/xgboost-maven-repo/list.html).

To enable the GPU algorithm (`tree_method='gpu_hist'`), use artifacts `xgboost4j-gpu_2.12` and `xgboost4j-spark-gpu_2.12` instead.
100
## Examples

Full code examples for Scala, Java, Apache Spark, and Apache Flink can
be found in the [examples package](https://github.com/dmlc/xgboost/tree/master/jvm-packages/xgboost4j-example).

**NOTE on LIBSVM Format**:

There is an inconsistency between XGBoost4J-Spark and the other language bindings of XGBoost.

When users load a training set or test set in LIBSVM format with Spark using the following code snippet:

```scala
spark.read.format("libsvm").load("trainingset_libsvm")
```

Spark assumes that the dataset uses 1-based feature indices. However, when you run prediction with other bindings of XGBoost (e.g. the Python API), XGBoost assumes that the dataset uses 0-based feature indices. This is a pitfall for users who train a model with Spark but then predict on a dataset in the same format with another XGBoost binding: the feature indices are off by one.
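
To make the off-by-one concrete, here is a minimal, self-contained Java sketch of one possible workaround: rewriting a 1-based LIBSVM line (as Spark expects) into a 0-based line before passing the file to another binding. The class `LibsvmIndexShift` and the method `toZeroBased` are hypothetical names invented for this illustration and are not part of XGBoost4J. The sketch assumes plain `label index:value ...` lines with no `qid:` tokens or comments.

```java
import java.util.StringJoiner;

// Hypothetical helper for illustration only; not part of XGBoost4J.
class LibsvmIndexShift {

    // Rewrites one LIBSVM line from 1-based to 0-based feature indices.
    // Assumes the form "label index:value index:value ..." (no qid, no comments).
    static String toZeroBased(String line) {
        String[] tokens = line.trim().split("\\s+");
        StringJoiner out = new StringJoiner(" ");
        out.add(tokens[0]); // the label is copied unchanged
        for (int i = 1; i < tokens.length; i++) {
            int colon = tokens[i].indexOf(':');
            if (colon < 0) {
                throw new IllegalArgumentException("Malformed feature: " + tokens[i]);
            }
            int index = Integer.parseInt(tokens[i].substring(0, colon));
            if (index < 1) {
                throw new IllegalArgumentException("Expected a 1-based index: " + tokens[i]);
            }
            // Shift the index down by one and keep ":value" as written.
            out.add((index - 1) + tokens[i].substring(colon));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Feature 1 becomes feature 0, feature 3 becomes feature 2.
        System.out.println(toZeroBased("1 1:0.5 3:2.0")); // prints "1 0:0.5 2:2.0"
    }
}
```

Whether shifting the file is the right fix depends on how the data was produced. The point is only that both sides must agree on the same index base.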
117
## Development

You can build/package xgboost4j locally with the following steps:

**Linux:**
1. Ensure [Docker for Linux](https://docs.docker.com/install/) is installed.
2. Clone this repo: `git clone --recursive https://github.com/dmlc/xgboost.git`
3. Run the following command:
  - With Tests: `./xgboost/jvm-packages/dev/build-linux.sh`
  - Skip Tests: `./xgboost/jvm-packages/dev/build-linux.sh --skip-tests`

**Windows:**
1. Ensure [Docker for Windows](https://docs.docker.com/docker-for-windows/install/) is installed.
2. Clone this repo: `git clone --recursive https://github.com/dmlc/xgboost.git`
3. Run the following command:
  - With Tests: `.\xgboost\jvm-packages\dev\build-linux.cmd`
  - Skip Tests: `.\xgboost\jvm-packages\dev\build-linux.cmd --skip-tests`

*Note: this will create jars for deployment on Linux machines.*