How to classify a tree

How to Identify Different Types of Trees

Whether you're taking a walk in the park or simply admiring your neighbor's landscape, it's nice to be able to identify different tree species. Who knows? You might want to plant a few of them in your own yard. If you're ready for some fun sleuth work, here's what to look for.

Leaf Identification Type

The starting point for most people when identifying tree species is the leaves. There are three basic leaf types: needles, scales and broadleaf. Most evergreens have needles or scales, while most broadleaf trees are deciduous, meaning they drop their leaves when dormant. However, there are exceptions. Larch has green needles that turn color in fall and drop off the tree. Live oak is an evergreen tree with broad, elliptical leaves.

Try the Arbor Day Foundation’s online tree identification tool.

How to Identify Trees By Leaf Shape

The shape of a leaf can also give clues when identifying broadleaf tree species. Common leaf identification shapes include ovate (egg shaped), lanceolate (long and narrow), deltoid (triangular), orbicular (round) and cordate (heart shaped). There is also the palm-shaped maple leaf and the lobed oak leaf, two of our most recognizable leaf shapes.

Meet 11 trees with great fall foliage.

How to Identify Trees By Bark Color

Ask most people to describe a tree’s bark and they’ll say “gray” or “brown” and leave it at that. While many tree species indeed have gray bark, some have bark that is cinnamon (mulberry), pure white (birch), silver (beech), greenish white (aspen) or copper (paperbark maple) in color.

How to Identify Trees By Bark Texture

There are many variations in texture between different tree species, as well. Bark can be furrowed (cottonwood), scaly (sycamore), peeling (hickory), smooth (beech), shiny (cherry), papery (birch) or warty (hackberry).

Learn how to make a shade tree thrive.

Bark Variations With Age

Often the color and texture of the bark change as the tree matures. This is most noticeable on the trunk—the oldest part of the tree. Silver maple, for example, will go from smooth and silver to furrowed and gray and black as it grows older, as the photo shows.

How to Identify Trees By Tree Shape

Some trees have a distinctive shape. Think of the vaselike habit of an American elm tree or the pyramid silhouette of a sweet gum. In some cases, the habit changes as the tree matures—often becoming more rounded or irregular—but shape can help with identifying younger trees that are grown in open space (as opposed to a wooded setting, which encourages taller, narrower growth).

You can train a tree’s shape to your liking. See our tree pruning techniques.

Tree Size and Location: What tree is this?

If you’re trying to identify tree species in a natural setting, you can study the site. Nature knows what it’s doing, distributing trees where they will thrive. Some species, such as willow, are more likely to grow near water, while others, such as black locust, are more upland tree species. A mature tree’s size can also help you whittle down the possibilities. If it’s 60 feet tall and 40 feet wide, you know it’s more apt to be an oak than a dogwood.

Not sure there’s room for a tree in your yard? Meet some space-saving trees for today’s smaller gardens.

How to Identify Trees By Flowers

While there’s a whole class known as flowering trees (everything from crabapples to magnolias), other tree species have inconspicuous flowers. Either way, flowers can help with identification. First, consider the color (although this isn’t a fail-safe method, since plant breeders have expanded the color palette in the cultivars they have developed). More helpful is to consider when the flower appears and what it looks like. Flower types include single blooms, clustered blooms or catkins (pictured), which are dense hanging spikes that look like tassels. Many trees bloom in spring, but some flower in summer or even early fall, helping you eliminate certain tree species as you investigate.

How to Identify Trees By Fruit Type

When you think of fruit, you probably think of larger fleshy fruits with seeds inside (apples, pears). But fruit is just a seed dispersal mechanism, so there are other variations to consider. Think of the papery winged fruits of maple, the nuts of chestnut, the acorns of oak, the catkins of willow, the berries of hawthorn and the cones of alder (pictured). All can help you pinpoint a tree species.

10 great trees to consider planting in your yard.

How to Identify Trees By Seed Comparison

The seeds themselves can help with more specific identification. Say you have an oak tree but you’re not sure what kind. Leaf shape is highly variable on oaks, even on the same specimen. A better indicator may be the acorns. Get your hands on a good guide such as The Audubon Society Field Guide to North American Trees (a mainstay in bookstores for decades). Then compare the acorns to what’s pictured in the guide. You’ll find that acorns can be small (black oak), big (bur oak), oblong (English oak) or barrel shaped (red oak). Some are even striped (pin oak). The cap that partially encases an acorn is also unique in size, shape and texture.

How to Identify Trees By Leaf Bud Arrangement

Buds can be helpful in identifying tree species in winter, when deciduous trees are without foliage. Those at the end of a twig are called terminal buds, while those growing along the twig are lateral buds. The arrangement of these lateral buds can help establish a tree’s identity. Alternate buds, found on elms, are staggered singly on alternating sides of the stem. The opposite buds of maple sit in pairs, directly facing each other on the stem. And spiral buds whorl alternately around the stem, as seen on oaks.

How to Identify Trees By Leaf Bud Appearance

Some trees have distinctive buds, such as the sharply pointed buds of beech and the small, clustered buds of oak, which are covered by protective scales. Bitternut hickory is hard to miss—just look for the sulfur-yellow buds when the tree is dormant.

Thinking of planting a tree? Don’t make these tree-planting mistakes.

Originally Published: June 10, 2019

Luke Miller

Luke Miller is an award-winning garden editor with 25 years' experience in horticultural communications, including editing a national magazine and creating print and online gardening content for a national retailer. He grew up across the street from a park arboretum and has a lifelong passion for gardening in general and trees in particular. In addition to his journalism degree, he has studied horticulture and is a Master Gardener.

Classification Tree | solver


Classification tree methods (i.e., decision tree methods) are recommended when the data mining task contains classifications or predictions of outcomes, and the goal is to generate rules that can be easily explained and translated into SQL or a natural query language.

A Classification tree labels records and assigns them to discrete classes. A Classification tree can also provide a measure of confidence that the classification is correct.

A Classification tree is built through a process known as binary recursive partitioning. This is an iterative process of splitting the data into partitions, and then splitting it up further on each of the branches.

Initially, a Training Set is created where the classification label (e.g., purchaser or non-purchaser) is known (pre-classified) for each record. Next, the algorithm systematically assigns each record to one of two subsets on some basis (e.g., income > $75,000 or income <= $75,000). The object is to attain a homogeneous set of labels (e.g., purchaser or non-purchaser) in each partition. This partitioning (splitting) is then applied to each of the new partitions. The process continues until no more useful splits can be found. The heart of the algorithm is the rule that determines the initial split (displayed in the following figure).


The process starts with a Training Set consisting of pre-classified records (target field or dependent variable with a known class or label such as purchaser or non-purchaser). The goal is to build a tree that distinguishes among the classes. For simplicity, assume that there are only two target classes, and that each split is a binary partition. The partition (splitting) criterion generalizes to multiple classes, and any multi-way partitioning can be achieved through repeated binary splits. To choose the best splitter at a node, the algorithm considers each input field in turn. In essence, each field is sorted. Every possible split is tried and considered, and the best split is the one that produces the largest decrease in diversity of the classification label within each partition (i.e., the increase in homogeneity). This is repeated for all fields, and the winner is chosen as the best splitter for that node. The process is continued at subsequent nodes until a full tree is generated.

XLMiner uses the Gini index, a commonly used measure of inequality, as the splitting criterion. The index ranges between 0 and 1. A Gini index of 0 indicates that all records in the node belong to the same category. A Gini index of 1 indicates that each record in the node belongs to a different category. For a complete discussion of this index, please see Classification and Regression Trees by Leo Breiman, Jerome Friedman, Richard Olshen, and Charles Stone (3).
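The split search described above can be sketched in a few lines of Python. This is a minimal illustration, not XLMiner's implementation: it assumes the Gini index takes the usual impurity form (1 minus the sum of squared class shares), scores a split by the record-weighted Gini index of the two partitions, and uses made-up income records. The function names `gini` and `best_split` are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini index of a list of class labels: 1 minus the sum of squared class shares."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(values, labels):
    """Try every threshold on one numeric field; return (threshold, weighted Gini).

    Records with value <= threshold go to the left partition, the rest to the right.
    """
    n = len(values)
    best = (None, gini(labels))  # baseline: the unsplit node's own Gini index
    # Candidate thresholds: every distinct value except the largest.
    for threshold in sorted(set(values))[:-1]:
        left = [lab for v, lab in zip(values, labels) if v <= threshold]
        right = [lab for v, lab in zip(values, labels) if v > threshold]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
        if weighted < best[1]:
            best = (threshold, weighted)
    return best

# Hypothetical training records: income in thousands of dollars and a known label.
incomes = [40, 55, 60, 80, 90, 120]
labels = ["non-purchaser", "non-purchaser", "non-purchaser",
          "purchaser", "purchaser", "purchaser"]

threshold, impurity = best_split(incomes, labels)
print(threshold, impurity)  # 60 0.0
```

On this toy data the split "income <= 60" separates the two classes completely, so both partitions have a Gini index of 0. A full tree builder would repeat the search over every input field and then recurse into each partition.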

Pruning the Tree

Pruning is the process of removing leaves and branches to improve the performance of the decision tree when moving from the Training Set (where the classification is known) to real-world applications (where the classification is unknown). The tree-building algorithm makes the best split at the root node where there are the largest number of records, and considerable information. Each subsequent split has a smaller and less representative population with which to work. Towards the end, idiosyncrasies of training records at a particular node display patterns that are peculiar only to those records. These patterns can become meaningless for prediction if you try to extend rules based on them to larger populations.

For example, if the classification tree is trying to predict height and it comes to a node containing one tall person X and several other shorter people, the algorithm can decrease diversity at that node by imposing a new rule that people named X are tall, and thus classify the Training Data. In the real world, this rule is obviously inappropriate. Pruning methods solve this problem: they let the tree grow to maximum size, then remove smaller branches that fail to generalize. (Note: Do not include irrelevant fields such as name; it is used here simply as an illustration.)

Since the tree is grown from the Training Set, when it reaches full structure it usually suffers from over-fitting (i.e., it is explaining random elements of the Training Data that are not likely to be features of the larger population of data). This results in poor performance on new data. Therefore, trees must be pruned using the Validation Set.

Ensemble Methods

XLMiner V2015 offers three powerful ensemble methods for use with Classification trees: bagging (bootstrap aggregating), boosting, and random trees. The Classification Tree Algorithm on its own can be used to find one model that results in good classifications of the new data. We can view the statistics and confusion matrices of the current classifier to see if our model is a good fit to the data, but how would we know if there is a better classifier waiting to be found? The answer is that we do not know if a better classifier exists. However, ensemble methods allow us to combine multiple weak classification tree models that, when taken together, form a new, more accurate strong classification tree model.

These methods work by creating multiple diverse classification models, by taking different samples of the original data set, and then combining their outputs. Outputs may be combined by several techniques, for example, majority vote for classification and averaging for regression. This combination of models effectively reduces the variance in the strong model. The three types of ensemble methods offered in XLMiner (bagging, boosting, and random trees) differ on three items: 1) the selection of a Training Set for each classifier or weak model; 2) how the weak models are generated; and 3) how the outputs are combined. In all three methods, each weak model is trained on the entire Training Set to become proficient in some portion of the data set.

Bagging (bootstrap aggregating) was one of the first ensemble algorithms to be documented. It is a simple, effective algorithm. Bagging generates several Training Sets by using random sampling with replacement (bootstrap sampling), applies the classification tree algorithm to each data set, then takes the majority vote between the models to determine the classification of the new data. The biggest advantage of bagging is the relative ease with which the algorithm can be parallelized, which makes it a better selection for very large data sets.
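As a rough sketch of the bagging procedure just described, the Python below draws bootstrap samples, fits a one-split tree (a "stump") to each, and combines the predictions by majority vote. The stump learner, the function names, and the income records are assumptions made for illustration; XLMiner would grow full classification trees instead.

```python
import random
from collections import Counter

def train_stump(records):
    """Fit a one-split tree on (value, label) records by minimizing training errors."""
    best = None
    for threshold in sorted({v for v, _ in records}):
        left = [lab for v, lab in records if v <= threshold]
        right = [lab for v, lab in records if v > threshold]
        left_label = Counter(left).most_common(1)[0][0]
        # If nothing falls to the right of the threshold, reuse the left label.
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = (sum(lab != left_label for lab in left)
                  + sum(lab != right_label for lab in right))
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    _, threshold, left_label, right_label = best
    return lambda v: left_label if v <= threshold else right_label

def bagging(records, n_models, seed=0):
    """Train n_models stumps on bootstrap samples; return a majority-vote predictor."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        # Bootstrap sample: same size as the Training Set, drawn with replacement.
        sample = [rng.choice(records) for _ in records]
        models.append(train_stump(sample))

    def predict(value):
        votes = Counter(model(value) for model in models)
        return votes.most_common(1)[0][0]

    return predict

# Hypothetical Training Set: (income in thousands of dollars, known label).
training_set = [(40, "non-purchaser"), (55, "non-purchaser"), (60, "non-purchaser"),
                (80, "purchaser"), (90, "purchaser"), (120, "purchaser")]

predict = bagging(training_set, n_models=25)
print(predict(30), predict(150))
```

Because each stump depends only on its own bootstrap sample, the loop that trains the models could be distributed across workers, which is the parallelization advantage noted above.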

Boosting builds a strong model by successively training models to concentrate on the misclassified records in previous models. Once completed, all classifiers are combined by a weighted majority vote. XLMiner offers three different variations of boosting as implemented by the AdaBoost algorithm (ensemble algorithm): M1 (Freund), M1 (Breiman), and SAMME (Stagewise Additive Modeling using a Multi-class Exponential loss function).

AdaBoost.M1 first assigns a weight $w_b(i)$ to each record or observation. This weight is originally set to $1/n$ and is updated on each iteration of the algorithm. An original classification tree is created using this first Training Set ($T_b$), and an error is calculated as

$$e_b = \sum_{i=1}^{n} w_b(i)\, I\big(C_b(x_i) \neq y_i\big)$$

where $C_b(x_i)$ is the class that the tree in the $b$th iteration predicts for record $i$, $y_i$ is the record's known class, and the $I()$ function returns 1 if true and 0 if not.

The error of the classification tree in the $b$th iteration is used to calculate the constant $\alpha_b$. This constant is used to update the weight $w_b(i)$. In AdaBoost.M1 (Freund), the constant is calculated as

$$\alpha_b = \ln\left(\frac{1 - e_b}{e_b}\right)$$

In AdaBoost.M1 (Breiman), the constant is calculated as

$$\alpha_b = \frac{1}{2}\ln\left(\frac{1 - e_b}{e_b}\right)$$

In SAMME, the constant is calculated as

$$\alpha_b = \frac{1}{2}\ln\left(\frac{1 - e_b}{e_b}\right) + \ln(k - 1)$$

where $k$ is the number of classes. When the number of categories is equal to 2, SAMME behaves the same as AdaBoost.M1 (Breiman).

In any of the three implementations (Freund, Breiman, or SAMME), the new weight for the $(b + 1)$th iteration will be

$$w_{b+1}(i) = w_b(i)\, \exp\big(\alpha_b\, I(C_b(x_i) \neq y_i)\big)$$
Afterwards, the weights are all readjusted to sum to 1. As a result, the weights assigned to the observations that were classified incorrectly are increased and the weights assigned to the observations that were classified correctly are decreased. This adjustment forces the next classification model to put more emphasis on the records that were misclassified. (The α constant is also used in the final calculation, which gives the classification model with the lowest error more influence.) This process repeats until b = Number of weak learners (controlled by the User). The algorithm then computes the weighted sum of votes for each class and assigns the winning classification to the record. Boosting generally yields better models than bagging; however, it has the disadvantage that it is not parallelizable. As a result, if the number of weak learners is large, boosting would not be suitable.
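One boosting iteration can be traced with a short Python sketch. It assumes the usual AdaBoost.M1 update, in which the weight of each misclassified record is multiplied by exp(α) before all weights are rescaled to sum to 1; the function names and the five-record example are hypothetical, and the code expects an error strictly between 0 and 1.

```python
import math

def weighted_error(weights, predicted, actual):
    """Error of one weak model: the summed weights of the misclassified records."""
    return sum(w for w, p, y in zip(weights, predicted, actual) if p != y)

def alpha(error, variant="freund", k=2):
    """The constant for the three variants in the text (requires 0 < error < 1)."""
    log_odds = math.log((1 - error) / error)
    if variant == "freund":
        return log_odds
    if variant == "breiman":
        return 0.5 * log_odds
    if variant == "samme":
        return 0.5 * log_odds + math.log(k - 1)
    raise ValueError(f"unknown variant: {variant}")

def update_weights(weights, predicted, actual, a):
    """Multiply misclassified weights by exp(a), then rescale so they sum to 1."""
    raw = [w * math.exp(a) if p != y else w
           for w, p, y in zip(weights, predicted, actual)]
    total = sum(raw)
    return [w / total for w in raw]

# Five records with equal starting weights 1/n; the last one is misclassified.
weights = [0.2] * 5
predicted = ["a", "a", "b", "b", "a"]
actual = ["a", "a", "b", "b", "b"]

e_b = weighted_error(weights, predicted, actual)   # 0.2
a_b = alpha(e_b, "freund")                         # ln(4)
new_weights = update_weights(weights, predicted, actual, a_b)
print(new_weights)  # the misclassified record now carries about half the total weight
```

With the Freund constant, the misclassified record's weight rises from 0.2 to about 0.5 and each correctly classified record falls to about 0.125, so the next weak model is pushed toward the record the previous one got wrong.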

Random trees (i.e., random forests) is a variation of bagging. This method works by training multiple weak classification trees using a fixed number of randomly selected features (sqrt[number of features] for classification, and number of features/3 for prediction), then takes the mode of the predicted classes to create a strong classifier. Typically, the number of “weak” trees generated in this method ranges from several hundred to several thousand, depending on the size and difficulty of the training set. Random trees are parallelizable since they are a variant of bagging. However, since random trees select a limited number of features in each iteration, they run faster than bagging.
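The feature-selection step can be illustrated with a small Python sketch. The rounding rule (rounding down, with a minimum of one feature), the function names, and the feature list are assumptions; the text only gives the sqrt and one-third rules.

```python
import math
import random

def features_per_tree(n_features, task="classification"):
    """Number of randomly selected features: sqrt(n) for classification, n/3 for prediction."""
    if task == "classification":
        return max(1, math.isqrt(n_features))  # rounding down is an assumption
    return max(1, n_features // 3)

def random_feature_subsets(feature_names, n_trees, task="classification", seed=0):
    """Draw one random feature subset (without repeats) for each weak tree."""
    rng = random.Random(seed)
    size = features_per_tree(len(feature_names), task)
    return [rng.sample(feature_names, size) for _ in range(n_trees)]

# Hypothetical input fields for a purchaser / non-purchaser model.
fields = ["income", "age", "region", "tenure", "visits",
          "basket_size", "coupons", "returns", "channel"]

subsets = random_feature_subsets(fields, n_trees=5)
for subset in subsets:
    print(subset)  # each weak tree sees only 3 of the 9 fields
```

Each weak tree would then be trained on a bootstrap sample restricted to its own subset of fields, which is why a single tree is cheaper to build than in plain bagging.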

Classification Tree Ensemble methods are very powerful methods, and typically result in better performance than a single tree. This feature addition in XLMiner V2015 provides more accurate classification models and should be considered over the single tree method.



Classification of trees

Classification, however, is not an end in itself but a means to an end. The challenge now is to uncover the biological features of trees more fully than before, drawing on the achievements of materialistic biological science, and to reflect them in a classification designed for stand tending. At the same time, one should not forget the economic side of the matter, which is the reason the task is posed in the first place. A classification of trees will be viable and useful only if it supports a sound economic evaluation of trees during thinning and so yields the greatest economic effect.[ ...]

Classification of trees (according to Schedelin, 1972)

Further development of tree classification for thinning should combine biological and economic approaches and take into account geographical conditions and the main features of the forest (even-aged or uneven-aged, simple or complex, mixed or pure, etc.). While describing the external differences between trees, a classification should also reveal their purpose and their potential, under the given natural conditions, for achieving the target objective: growing a forest of a specified character by the time of the final felling. In other words, along with general classifications, it is advisable to develop local classifications of trees.[ ...]

As an example, here is one of the tree classification schemes (p. 97). It uses a variety of growth-habit features that clearly reflect the connection between life forms and living conditions (the qualifiers “forest”, “savannah”, “subarctic”, etc. had to be included in the names of the groups).[ ...]

The numerical principle of tree classification, including decimal coding, is also used in our country. A seven-digit numerical indicator of trees for selective felling was proposed by A. I. Zviedris (1956).[ ...]

In France, in the last century, a classification of trees was developed for the tending of oak. It distinguishes three classes of trees: trees of the future (the chosen ones), which are large trees with the best trunk form and are the object of tending; trees that are large but less technically valuable, which interfere with the chosen ones, suppress their growth, and are to be cut down; and trees that are useful to the chosen ones: standing in the lower tier, they serve as a "fur coat", help to clear the trunks of branches, etc. [ ...]

Swiss scientist Schedelin (W. Scha [...]

Denmark has accumulated a great deal of experience in tending beech: a classification of trees has been developed for it, and tending must be repeated frequently. In Russia only some elements of this classification can be used, since the distinctive character of the beech forests in our country calls for its own approaches. It is important to establish, record and evaluate the relationship of beech with such valuable species as fir and spruce in the Caucasus; spruce, oak, fir, maple and other species in the Carpathians; and the unique combinations of forest tree species in the Crimea and Moldavia, and on this basis to refine the thinning parameters for mixed and pure beech stands. These issues are somewhat easier to resolve where beech is suppressed by less valuable species (birch, aspen, hornbeam), which can be removed. However, one must take into account the shade tolerance of beech and avoid sharp reductions in stand density.[ ...]

To carry out this felling method, a technique for marking trees for felling was developed in detail (the order in which they are branded and measured, and how their condition is recorded), and a classification of trees with diagnostic features was given. When trees are cut, the condition of the stand and the soil is taken into account. Therefore, even overmature trees, if they are healthy, can be left standing. Deciduous trees in spruce forests are marked for cutting with care so as not to weaken the wind resistance of the stands.[ ...]

In this regard, it is natural to seek the right solution and to strive to improve, and where reasonable replace, the existing tree classifications.[ ...]

Thinning is the oldest branch of forestry, and a great deal of experience has been accumulated in world practice. In particular, a number of tree classifications have been developed, data on environmental changes in the forest have been obtained, the range of forest mensuration indicators studied has been expanded, etc. There are also scientifically substantiated methods and techniques for specific conditions. However, in this area, science has not yet given answers to all questions.[ ...]

In recent years, foresters have made attempts to apply the theory of stage development to thinning, to build new classifications of trees on this basis. But so far, none of the newly proposed classifications has been developed to such an extent that it can already be recommended for widespread use in production.[ ...]

And among the foresters there were opportunists who began to speculate on stage theory. Along with serious attempts to extract some useful points, there appeared far-fetched schemes and classifications of trees, with recommendations for selecting trees during thinning that sometimes contradicted common sense to the point that the best-quality trees fell into the category of the worst and, conversely, the worst were elevated to the rank of the best; but it sounded fashionable: “stage young” and “stage old”. Despite the "fashion" zealously promoted by the Lysenkoites, forestry practice did not accept these recommendations. However, the "innovations" went so far that new names and new authors were simply mechanically attached to old, established classifications. So, Kraft's classification suddenly ceased to belong to the famous German forester Kraft, turning into Nesterov's classification (Nesterov V. G. General forestry. 1949.[ ...]

Artificial selection is the forester's main concern throughout the entire period of forest tending, regardless of the age of the stand. To carry it out in purposeful forest cultivation and to identify trees with high resin productivity, trees must be classified according to a definite attribute. The classification of trees is the basis for selecting trees with the necessary economically valuable traits.[ ...]

Sometimes, unfortunately, well-known propositions are presented under loud headlines, under the guise of something new in forestry, and real innovation is replaced by superficial declarative statements. For example, the classification of trees widely known in the world and Russian forestry literature, developed by the German forester Kraft in the 80s of the last century [63], ceased to be Kraft's classification 65 years later, in 1949 [64]. [...]

Train Random Trees Classifier - reference

  • Summary
  • Usage
  • Code sample
  • Environments
  • Licensing information


Generates an Esri classifier definition (.ecd) file using the Random Trees classification method.

The Random Trees Classifier is a powerful image classification engine that is resistant to overfitting and capable of working with segmented images and other ancillary raster datasets. The tool accepts multi-band images of any bit depth as standard input images and performs the Random Trees classification on a pixel basis or by segment, based on the input training feature file.


  • Random trees is a collection of individual decision trees, where each tree is generated from different samples and subsets of the training data. They are called decision trees because, for each classified pixel, a number of decisions are made in order of importance. When these decisions are graphed for a pixel, they look like a branch. When the entire dataset is classified, the branches form a tree. The method is called random trees because the dataset is classified multiple times based on a random subset of training pixels, resulting in multiple decision trees. To make a final decision, each tree has a vote. This is done to mitigate overfitting. The Random Trees Classifier is a supervised machine learning classifier based on building many decision trees, selecting random subsets of variables for each tree, and using the most frequently occurring tree output as the overall classification. The random trees method corrects for the propensity of decision trees to overfit their training samples. In this method, a large number of trees are grown (by analogy, a forest), and variation among the trees is introduced by projecting the training data into a randomly selected subspace before fitting each tree. The decision at each node is optimized by a randomized procedure.

  • For segmented rasters whose key property is set to Segmented, the tool computes the index image and associated segment attributes from the segmented RGB raster. The attributes are computed to create the classifier definition file, which is used in a separate classification tool. Attributes for each segment can be computed from any Esri-supported image.

  • Any Esri-supported raster is accepted as input, including raster products, segmented rasters, mosaics, image services, or raster datasets in generic formats. Segmented rasters must be 8-bit with 3 bands.

  • To create a training sample file, use the Training Sample Manager on the Image Classification toolbar. For more information about using the Image Classification toolbar, see What is Image Classification?

  • The Segment Attributes option is only enabled when one of the input raster layers is a segmented image.


in_raster

Raster Layer; Mosaic layer; image service; String

in_training_features

Select the training sample file or layer that defines the training sample regions.

These can be either shapefiles or feature classes that contain your training samples.

Feature Layer; Raster Catalog Layer

out_classifier_definition

JSON file that contains attribute information, statistics, and other information needed for the classifier. A file with the .ecd extension is created.

in_additional_raster

Optionally include auxiliary raster datasets, such as a multispectral image or DTM, to create attributes and other information needed for the classification.

Raster Layer; Mosaic layer; image service; String

max_num_trees

The maximum number of trees in the forest. Increasing the number of trees leads to better accuracy, but the improvement eventually levels off. Processing time increases in proportion to the number of trees.

max_tree_depth

The maximum depth of each tree in the forest. Depth is another way of specifying the number of rules each tree is allowed to create in order to reach a decision. Trees will not grow deeper than this setting.

max_samples_per_class

The maximum number of samples to define each class.

When the input is non-segmented rasters, the default value of 1000 is recommended. A value less than or equal to 0 means that the system will use all samples from the training locations to train the classifier.

used_attributes

  • COLOR: The RGB color values obtained from the input raster, on a per-segment basis.
  • MEAN: The mean digital number (DN) derived from the additional raster image, on a per-segment basis.
  • STD: The standard deviation derived from the additional raster image, on a per-segment basis.
  • COUNT: The number of pixels that make up the segment, on a per-segment basis.
  • COMPACTNESS: The degree to which a segment is compact or round, on a per-segment basis. Values range from 0 to 1, where 1 represents a circle.
  • RECTANGULARITY: The extent to which a segment is rectangular, on a per-segment basis. Values range from 0 to 1, where 1 represents a rectangle.

This option is active only when the Segmented key property is set for the input raster. If only a segmented image is used as input to the tool, then the default attributes are COLOR, COUNT, COMPACTNESS, and RECTANGULARITY. If an in_additional_raster is also used as input along with the segmented image, then MEAN and STD are also available as options.


Sample code

TrainRandomTreesClassifier example 1 (Python window)

Sample Python script for the TrainRandomTreesClassifier tool.
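A minimal sketch of such a call follows. It assumes an ArcGIS installation with `arcpy` and a Spatial Analyst license, and passes the parameters positionally in the order they are described above. The helper names, the tree count of 50, and the depth of 30 are illustrative choices, and any paths passed in would be your own; none of them come from this page.

```python
def build_used_attributes(has_additional_raster):
    """Semicolon-delimited attribute list, following the defaults described above."""
    attributes = ["COLOR", "COUNT", "COMPACTNESS", "RECTANGULARITY"]
    if has_additional_raster:
        # MEAN and STD only become available when an additional raster is supplied.
        attributes += ["MEAN", "STD"]
    return ";".join(attributes)

def train_classifier(in_raster, in_training_features, out_classifier_definition,
                     in_additional_raster=None, max_num_trees=50,
                     max_tree_depth=30, max_samples_per_class=1000):
    """Run the tool on a segmented raster (requires ArcGIS; not runnable without arcpy)."""
    # Imported here so that the helper above can be used without ArcGIS installed.
    import arcpy
    from arcpy.sa import TrainRandomTreesClassifier

    arcpy.CheckOutExtension("Spatial")
    TrainRandomTreesClassifier(
        in_raster, in_training_features, out_classifier_definition,
        in_additional_raster, max_num_trees, max_tree_depth,
        max_samples_per_class,
        build_used_attributes(in_additional_raster is not None),
    )

# Hypothetical usage with placeholder paths:
# train_classifier("c:/data/segmented.tif", "c:/data/train.gdb/samples",
#                  "c:/output/random_trees.ecd")
print(build_used_attributes(False))
```

The attribute helper mirrors the rule stated above: COLOR, COUNT, COMPACTNESS, and RECTANGULARITY for a segmented image alone, with MEAN and STD added only when an additional raster is supplied.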
