
Scikit-learn Release Notes July 2017

Scikit-learn is a popular machine learning library for Python. It provides simple and efficient tools for data mining and data analysis. Scikit-learn contains algorithms for clustering, classification, and regression. It integrates well with NumPy, SciPy, and other Python scientific libraries. Scikit-learn was originally developed in 2007 and has grown significantly, with over 1.3 million downloads per month as of 2023.

Uploaded by

levin696
  • Overview and Version History
  • scikit-learn Tools and References
  • References Continued
  • External Links

scikit-learn

scikit-learn (formerly [Link] and also known as sklearn) is a free software machine learning library for the Python programming language.[3] It features various classification, regression and clustering algorithms including support-vector machines, random forests, gradient boosting, k-means and DBSCAN, and is designed to interoperate with the Python numerical and scientific libraries NumPy and SciPy. Scikit-learn is a NumFOCUS fiscally sponsored project.[4]

scikit-learn
Original author(s): David Cournapeau
Initial release: June 2007
Stable release: 1.3.0[1] / 30 June 2023
Repository: [Link]/scikit-learn/scikit-learn (https://[Link]m/scikit-learn/scikit-learn)
Written in: Python, Cython, C and C++[2]
Operating system: Linux, macOS, Windows
Type: Library for machine learning
License: New BSD License
Website: [Link] ([Link][Link]/)

Overview
The scikit-learn project started as [Link], a Google Summer of Code project by French data scientist David Cournapeau. The name of the project stems from the notion that it is a "SciKit" (SciPy Toolkit), a separately developed and distributed third-party extension to SciPy.[5] The original codebase was later rewritten by other developers. In 2010, contributors Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort and Vincent Michel, from the French Institute for Research in Computer Science and Automation in Saclay, France, took leadership of the project and released the first public version of the library on February 1, 2010.[6] In November 2012, scikit-learn and scikit-image were described as two of the "well-maintained and popular" scikits libraries.[7] In 2019, it was noted that scikit-learn is one of the most popular machine learning libraries on GitHub.[8]

Implementation
scikit-learn is largely written in Python, and uses NumPy extensively for high-performance linear algebra and array operations. Furthermore, some core algorithms are written in Cython to improve performance. Support vector machines are implemented by a Cython wrapper around LIBSVM; logistic regression and linear support vector machines by a similar wrapper around LIBLINEAR. In such cases, extending these methods with Python may not be possible.
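
As a rough illustration of the wrappers described above, the sketch below fits three estimators that the scikit-learn documentation describes as backed by LIBSVM or LIBLINEAR. The synthetic dataset, the parameter values, and the dictionary labels are assumptions chosen for the example, not part of the original text.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC, LinearSVC

# Small synthetic binary classification problem (illustrative only).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Each estimator exposes the same fit/score interface, whatever native
# library sits underneath it.
models = {
    "SVC (LIBSVM)": SVC(kernel="rbf"),
    "LinearSVC (LIBLINEAR)": LinearSVC(max_iter=10000),
    "LogisticRegression (LIBLINEAR)": LogisticRegression(solver="liblinear"),
}

# Training accuracy for each model, keyed by the label above.
scores = {name: model.fit(X, y).score(X, y) for name, model in models.items()}

for name, score in scores.items():
    print(f"{name}: training accuracy {score:.2f}")
```

The Python-level interface is identical across the three models; only the underlying solver differs.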

scikit-learn integrates well with many other Python libraries, such as Matplotlib and plotly for plotting,
NumPy for array vectorization, Pandas dataframes, SciPy, and many more.
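
A minimal sketch of this interoperability, assuming a small synthetic regression problem: the same estimator class is fitted once on a NumPy array and once on a SciPy sparse matrix. The data, the coefficients, and the choice of `Ridge` are illustrative assumptions.

```python
import numpy as np
from scipy import sparse
from sklearn.linear_model import Ridge

# Synthetic data: zero out small entries so a sparse format makes sense.
rng = np.random.default_rng(0)
X_dense = rng.normal(size=(50, 4))
X_dense[np.abs(X_dense) < 0.5] = 0.0
y = X_dense @ np.array([1.0, -2.0, 0.5, 3.0])

# The same values stored as a SciPy CSR matrix.
X_sparse = sparse.csr_matrix(X_dense)

# Ridge accepts both input types through the same fit/predict calls.
model_dense = Ridge(alpha=1.0).fit(X_dense, y)
model_sparse = Ridge(alpha=1.0).fit(X_sparse, y)

r2_dense = model_dense.score(X_dense, y)
r2_sparse = model_sparse.score(X_sparse, y)
print(f"R^2 dense: {r2_dense:.3f}, R^2 sparse: {r2_sparse:.3f}")
```

The two fits may use different internal solvers, so the coefficients should be close but are not guaranteed to be bit-for-bit identical.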

Version history
scikit-learn was initially developed by David Cournapeau as a Google Summer of Code project in 2007.
Later that year, Matthieu Brucher joined the project and started to use it as a part of his thesis work. In
2010, INRIA, the French Institute for Research in Computer Science and Automation, got involved and the
first public release (v0.1 beta) was published in late January 2010.

August 2013. scikit-learn 0.14[9]
July 2014. scikit-learn 0.15.0[9]
March 2015. scikit-learn 0.16.0[9]
November 2015. scikit-learn 0.17.0[9]
September 2016. scikit-learn 0.18.0
July 2017. scikit-learn 0.19.0
September 2018. scikit-learn 0.20.0[10]
May 2019. scikit-learn 0.21.0[11]
December 2019. scikit-learn 0.22[12]
May 2020. scikit-learn 0.23.0[13]
January 2021. scikit-learn 0.24[14]
September 2021. scikit-learn 1.0.0[15][16]
October 2021. scikit-learn 1.0.1[17]
December 2021. scikit-learn 1.0.2[18]
May 2022. scikit-learn 1.1.0[19]
May 2022. scikit-learn 1.1.1[20]
August 2022. scikit-learn 1.1.2[21]
October 2022. scikit-learn 1.1.3[22]
December 2022. scikit-learn 1.2.0[23]
January 2023. scikit-learn 1.2.1[24]
March 2023. scikit-learn 1.2.2[25]

See also
mlpy
SpaCy
NLTK
Orange
PyTorch
TensorFlow
[Link]
List of numerical analysis software
[Link]

References
1. "Release 1.3.0" ([Link] 30 June 2023. Retrieved 1 July 2023.
2. "The scikit-learn Open Source Project on Open Hub: Languages Page" ([Link][Link]/p/scikit-learn/analyses/latest/languages_summary). Open Hub. Retrieved 14 July 2018.
3. Fabian Pedregosa; Gaël Varoquaux; Alexandre Gramfort; Vincent Michel; Bertrand Thirion; Olivier Grisel; Mathieu Blondel; Peter Prettenhofer; Ron Weiss; Vincent Dubourg; Jake Vanderplas; Alexandre Passos; David Cournapeau; Matthieu Perrot; Édouard Duchesnay (2011). "scikit-learn: Machine Learning in Python" ([Link]html). Journal of Machine Learning Research. 12: 2825–2830.
4. "NumFOCUS Sponsored Projects" ([Link] NumFOCUS. Retrieved 2021-10-25.
5. Dreijer, Janto. "scikit-learn" ([Link]
6. "About us — scikit-learn 0.20.1 documentation" ([Link]ory). [Link].
7. Eli Bressert (2012). SciPy and NumPy: an overview for developers ([Link]m/books?id=fLKTuJqQLVEC&pg=PA43). O'Reilly. p. 43.
8. "The State of the Octoverse: machine learning" ([Link]he-octoverse-machine-learning/). The GitHub Blog. GitHub. 2019-01-24. Retrieved 2019-10-17.
9. "Release history — scikit-learn 0.19.dev0 documentation" ([Link][Link]). [Link]. Retrieved 2017-02-27.
10. "Release History - 0.20.0 documentation" ([Link]sion-0-20). scikit-learn. Retrieved 6 November 2018.
11. "Release History - 0.21.0 documentation" ([Link]sion-0-21-0). scikit-learn. Retrieved 5 May 2019.
12. "Release History - 0.22 documentation" ([Link] scikit-learn. Retrieved 7 June 2020.
13. "Release History - 0.23.0 documentation" ([Link]#version-0-23-0). scikit-learn. Retrieved 7 June 2020.
14. "Release History - 0.24 documentation" ([Link] scikit-learn. Retrieved 2021-02-08.
15. "Release History - 1.0.0 documentation" ([Link]rsion-1-0-0). scikit-learn.
16. "Release History - 1.0.0 documentation" ([Link]rsion-1-0-0). scikit-learn.
17. "Release History - 1.0.1 documentation" ([Link]rsion-1-0-1). scikit-learn.
18. "Release History - 1.0.2 documentation" ([Link] scikit-learn.
19. "Release History - 1.1.0 documentation" ([Link]rsion-1-1-0). scikit-learn.
20. "Release History - 1.1.1 documentation" ([Link]rsion-1-1-1). scikit-learn.
21. "Release History - 1.1.2 documentation" ([Link]rsion-1-1-2). scikit-learn.
22. "Release History - 1.1.3 documentation" ([Link] scikit-learn.
23. "Release History - 1.2.0 documentation" ([Link]rsion-1-2-0). scikit-learn.
24. "Release History - 1.2.1 documentation" ([Link]rsion-1-2-1). scikit-learn.
25. "Release History - 1.2.2 documentation" ([Link] scikit-learn.

External links
Official website ([Link]
scikit-learn ([Link] on GitHub


Common questions


Scikit-learn began in 2007 as a Google Summer of Code project by David Cournapeau. Contributors from INRIA took leadership of the project in 2010, and the library has since evolved through numerous versions, with contributions from community developers and fiscal sponsorship from NumFOCUS. This continuous improvement and community engagement have helped establish scikit-learn as a widely adopted machine learning library.

Scikit-learn is designed to interoperate with NumPy and SciPy, and it integrates well with other Python libraries such as Matplotlib and Pandas. This lets data manipulation, numerical computation, and visualization sit alongside machine learning tasks ranging from data preprocessing to model evaluation.

As a NumFOCUS fiscally sponsored project, scikit-learn receives financial oversight and organizational support, which aids its long-term sustainability. Such sponsorship can help fund development activities, infrastructure improvements, and community events, and it lends credibility to the project.

Scikit-learn is distributed under the New BSD License, which permits free use, distribution, and modification. This permissive license lowers the barrier to using the library in both academic research and commercial applications, including integration into proprietary software.

Some of scikit-learn's core algorithms are written in Cython to improve performance; support vector machines, for example, are implemented by a Cython wrapper around LIBSVM. Cython compiles Python-like code to C, which typically runs faster than pure Python. The rest of the library is largely written in Python, which is easier to read and modify. The trade-off is that Cython code can be harder to develop and contribute to, while pure Python may be slower in computationally intensive tasks.

Scikit-learn runs on Linux, macOS, and Windows, so users can develop and run machine learning applications on whichever platform they prefer. This cross-platform support also makes it easier to collaborate across different environments in both academic research and commercial development.

INRIA became involved in scikit-learn in 2010, when contributors from the institute took leadership of the project and released the library's first public version. This involvement brought structured development effort and helped the library gain visibility and credibility in the academic and industrial communities.

Extending scikit-learn can be challenging where performance-critical components are written in Cython, C, or C++. In particular, methods implemented as wrappers around LIBSVM and LIBLINEAR may not be extensible from Python. In exchange, these compiled components provide faster computation, while the rest of the library remains largely Python code built on NumPy and SciPy.
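
Where a wrapped native routine cannot be extended from Python, one option is to write a separate estimator in pure Python that follows the library's fit/predict convention. The sketch below is a minimal example under that assumption; `MajorityClassifier` is a hypothetical name invented for illustration and is not part of scikit-learn.

```python
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin


class MajorityClassifier(ClassifierMixin, BaseEstimator):
    """Hypothetical estimator: always predicts the most frequent training label."""

    def fit(self, X, y):
        # Count each label and remember the most frequent one.
        classes, counts = np.unique(y, return_counts=True)
        self.classes_ = classes
        self.majority_ = classes[np.argmax(counts)]
        return self

    def predict(self, X):
        # Ignore the features and repeat the majority label for every row.
        return np.full(len(X), self.majority_)


X = np.zeros((5, 2))
y = np.array([0, 1, 1, 1, 0])

model = MajorityClassifier().fit(X, y)
predictions = model.predict(X)
# score() is inherited from ClassifierMixin and reports mean accuracy.
accuracy = model.score(X, y)
print(predictions, accuracy)
```

Because the class inherits from `BaseEstimator` and `ClassifierMixin`, it picks up `get_params`, `set_params`, and `score` without any compiled code.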

David Cournapeau laid the groundwork for scikit-learn as a Google Summer of Code project in 2007, and Matthieu Brucher joined later that year and used it as part of his thesis work. These early contributions preceded the rewrite of the original codebase by other developers and the community-driven growth that followed.

Scikit-learn includes a range of algorithms for classification, regression, and clustering, including support-vector machines, random forests, gradient boosting, k-means and DBSCAN. Practitioners can apply these established techniques without reimplementing them, which speeds up the development of predictive models.
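
To make this concrete, here is a short sketch that applies a random forest classifier and k-means clustering to the same synthetic data. The dataset, the parameter values, and the variable names are assumptions chosen for illustration.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Three synthetic, reasonably well separated groups of 2-D points.
X, y = make_blobs(n_samples=300, centers=3, cluster_std=0.6, random_state=0)

# Supervised: train a random forest on labelled data, evaluate on held-out data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)

# Unsupervised: group the same points with k-means, ignoring the labels.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
n_clusters_found = len(set(kmeans.labels_))

print(f"held-out accuracy: {accuracy:.2f}, clusters found: {n_clusters_found}")
```

Both estimators follow the same `fit` pattern, so swapping in another classifier or clustering algorithm usually changes only the constructor line.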
