Topics
Field Programmable Gate Array, Random Forests, Architecture, Machine Learning
22 Citations
- C. Pham-Quoc, Trung Pham-Dinh, Binh Kieu-Do-Nguyen
- 2024
Computer Science, Engineering
Journal of Advances in Information Technology
This paper presents an efficient architecture to perform random forest effectively for edge computing platforms based on Field Programmable Gate Array (FPGA) technology and proposes a sufficient structure for storing decision trees’ information for the execution of DTUs.
- Milan Shah, Reece Neff, Hancheng Wu, Marco Minutoli, Antonino Tumeo, M. Becchi
- 2022
Computer Science
ICPP
This work proposes a hierarchical memory layout suitable to the GPU/FPGA memory hierarchy, and designs three RF classification code variants based on that layout, and investigates GPU- and FPGA-specific considerations for these kernels.
- 1
- Maxim Shepovalov, V. Akella
- 2020
Computer Science, Engineering
Integr.
- 6
- Shuang Zhao, Shuhui Chen, Hui Yang, Fei Wang, Ziling Wei
- 2021
Computer Science, Engineering
J. Parallel Distributed Comput.
- 6
- Binglei Lou, D. Boland, Philip H. W. Leong
- 2023
Computer Science, Engineering
ACM Trans. Reconfigurable Technol. Syst.
This article proposes a flexible computing architecture consisting of multiple partially reconfigurable regions, pblocks, which each implement anomaly detectors, and compares fSEAD to an equivalent central processing unit (CPU) implementation using four standard datasets, with speedups ranging from 3× to 8×.
- Hongfei Wang, Jianwen Li, Kun He
- 2020
Computer Science
ACM Trans. Design Autom. Electr. Syst.
A hierarchical ensemble reduction and learning framework that bridges the gap between software-based ensemble methods and hardware computing in the IoT era and developed a novel conversion paradigm that supports the automatic deployment of >500 trees on a chip.
- 3
- Highly Influenced
- Narasinga Rao Miniskar, Aaron R. Young, Frank Liu, W. Blokland, A. Cabrera, J. Vetter
- 2022
Engineering, Physics
2022 32nd International Conference on Field…
An FPGA design of an extremely low latency scientific machine learning application at the edge is presented and it is demonstrated that the implementation can achieve 60 nanoseconds of computing latency for complex random forest models with 10,000 input features.
- 3
- Andrea Damiani, Emanuele Del Sozzo, M. Santambrogio
- 2022
Computer Science, Environmental Science
2022 27th Asia and South Pacific Design…
Entree, the first automatic design flow for deploying the inference of Decision Tree (DT) ensembles over Field-Programmable Gate Arrays (FPGAs) at the network's edge, is proposed, which exploits dynamic partial reconfiguration on modern FPGA-enabled Systems-on-a-Chip (SoCs) to accelerate arbitrarily large DTEnsembles at a latency a hundred times stabler than software alternatives.
- 5
- Hongfei Wang, Jianwen Li, Kun He, Wenjie Cai
- 2018
Computer Science, Engineering
2018 International Conference on Hardware…
Due to their inherent structures, tree ensembles are ideal for exploiting the computational parallelism on FPGA, making them highly promising for wearable devices and embedded systems at the edge nodes of the IoT.
- 3
- Highly Influenced
- M. Elnawawy, A. Sagahyroon, T. Shanableh
- 2020
Computer Science, Engineering
IEEE Access
The suitability of packet-level and flow-level features is validated using stepwise regression and random forest feature selection and it is indicated that random forest outperforms other algorithms achieving a maximum accuracy of 98.5% and an F-score of 0.932.
- 25
11 References
- C. Cheng, C. Bouganis
- 2013
Computer Science, Engineering
2013 23rd International Conference on Field…
A novel FPGA architecture for accelerating the RF training step is presented, exploring key features of the device and achieving speed-up factors up to 230x over a 3GHz Intel Core i5 processor when an Altera Stratix IV device is utilised under classification problems drawn from VOC2007.
- 20
- B. V. Essen, C. Macaraeg, M. Gokhale, R. Prenger
- 2012
Computer Science, Engineering
2012 IEEE 20th International Symposium on Field…
FPGAs provide the highest performance solution, but require a multi-chip / multi-board system to execute even modest sized forests, while GP-GPUs offer a more flexible solution with reasonably high performance that scales with forest size, and multi-threading via Open MP on a shared memory system was the simplest solution.
- 158
- Highly Influential
- J.R. Struharik
- 2011
Computer Science, Engineering
2011 IEEE 9th International Symposium on…
Experimental results obtained on 23 datasets of standard University of California Irvine (UCI) Machine Learning Repository database suggest that the proposed architecture based on the sequence of universal nodes requires on average 56% less hardware resources compared with the previously proposed architectures, having the same throughput.
- 39
- R. Narayanan, Daniel Honbo, G. Memik, A. Choudhary, Joseph Zambreno
- 2007
Computer Science, Engineering
DATE '07
This paper identifies the compute-intensive kernel (Gini Score computation) in the algorithm, and develops a highly efficient architecture, which is further optimized by reordering the computations and by using a bitmapped data structure.
- 70
- F. Saqib, Aindrik Dutta, J. Plusquellic, Philip Ortiz, M. Pattichis
- 2015
Computer Science, Engineering
IEEE Transactions on Computers
This work has devised a pipelined architecture for the implementation of axis parallel binary DTC that dramatically improves the execution time of the algorithm while consuming minimal resources in terms of area, and is 3.5 times faster than the existing hardware implementation of classification.
- 74
- Shanker Shreejith, Bezborah Anshuman, Suhaib A. Fahmy
- 2016
Computer Science, Engineering
It is shown that the hybrid platform outperforms an optimised software implementation on an automotive grade ARM Cortex M4 processor in terms of latency and power consumption, also providing better consolidation.
- 12
- L. Breiman
- 2001
Computer Science, Mathematics
Machine Learning
Internal estimates monitor error, strength, and correlation and these are used to show the response to increasing the number of features used in the forest, and are also applicable to regression.
- 87,608
- Xuanle Ren, Mitchell Martin, Shawn Blanton
- 2015
Computer Science, Engineering
2015 IEEE 33rd VLSI Test Symposium (VTS)
A novel incremental-learning algorithm, namely dynamic k-nearest-neighbor (DKNN), is developed to improve the accuracy of on-chip diagnosis, which employs online diagnosis data to update the learned classifier so that the classifier can keep evolving as new diagnosis data becomes available.
- 14
- Thomas G. Dietterich
- 2000
Computer Science, Mathematics
Lecture Notes in Computer Science
Some previous studies comparing ensemble methods are reviewed, and some new experiments are presented to uncover the reasons that Adaboost does not overfit rapidly.
- 6,308
- Xuanle Ren, V. Tavares, Shawn Blanton
- 2015
Computer Science, Engineering
A JTAG protection scheme, SLIC-J, is proposed to monitor user behavior and detect illegitimate accesses to the JTAG, which is characterized using a set of specifically-defined features, and then an on-chip classifier is used to predict whether the user is legitimate or not.
- 19
- Highly Influential