For my final project I created a population growth map for the Dallas/Fort Worth Metro Area. The Dallas/Fort Worth Metro Area is the fourth-largest metropolitan area in the United States and the largest inland metro area. With a population of 7.2 million people and a population growth rate of 12.5%, where will all of these people live? The map I created, using current population estimates and future growth estimates, should give a clear picture of where people are moving within the Dallas/Fort Worth Metro.
Two thematic methods are used for this map. The first method uses graduated symbols to display 2017 population growth data. The growth data is derived from the North Central Texas Council of Governments, which provides a 2017 Annual Population Estimate covering the spring of 2016 to the spring of 2017. Each city's population is estimated separately by housing unit type, with variations in household size and occupancy rates taken into account for the population estimate of each housing unit type.
Using graduated symbols to display the 2017 Annual Population Estimate data lets the reader see the amount of growth, since the size of each symbol is proportional to the growth value. The largest symbols therefore represent the highest growth, and the smaller symbols represent little change. The population growth data was in Excel format and was joined to a city boundary polygon layer. Relating the table to the boundary layer by city name places the graduated symbols at the centroid of each city.
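A rough sketch of this join in Python, using pandas and geopandas rather than the ArcMap tools actually used, might look like the following; the file names and column names (such as "CityName" and "PctIncrease") are placeholders, not the actual dataset fields.

    import pandas as pd
    import geopandas as gpd

    # Hypothetical file and field names; the real data lived in an Excel table
    # and a city boundary shapefile joined inside ArcMap.
    growth = pd.read_excel("population_estimates_2017.xlsx")
    cities = gpd.read_file("city_boundaries.shp")

    # Relate the growth table to the boundary layer by city name.
    joined = cities.merge(growth, on="CityName", how="inner")

    # Graduated symbols are placed at each city's centroid, sized by growth.
    points = joined.copy()
    points["geometry"] = points.geometry.centroid
    points.plot(markersize=points["PctIncrease"] * 10)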
The 2017 Annual Population Estimate data was standardized by calculating the percent increase from 2016 to 2017 for each city's population estimate. Subtracting the 2016 value from the 2017 value gives the amount of increase. The percent increase is then obtained by dividing that increase by the 2016 value and multiplying by 100. Standardizing the data as a percent increase makes high-growth cities stand out because it does not depend on the actual size of each city. If the data were not standardized and only the 2017 population totals were used, the results would be poor: cities with large populations would show large values without giving any sense of how much the population changed. The data must therefore be standardized as a percent increase for the graduated symbols to be read properly.
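As a small worked example of this standardization, with made-up numbers rather than values from the dataset:

    def percent_increase(pop_2016: float, pop_2017: float) -> float:
        """Percent increase from the 2016 estimate to the 2017 estimate."""
        increase = pop_2017 - pop_2016
        return increase / pop_2016 * 100

    # A city that grew from 20,000 to 21,500 residents:
    percent_increase(20_000, 21_500)   # -> 7.5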
The 2017 Annual Population Estimate data is classified with the Natural Breaks classification using 5 class breaks. Most of the data falls near the beginning of the number line, tapers off in the middle, and has a few outliers far down the number line. These outliers truly represent the highest values and are necessary for seeing the high growth. Natural Breaks classification lets the class breaks fall naturally where the data levels off along the number line while still breaking out the high outliers appropriately. Some further cleaning of the data was needed to remove outliers from very small cities. Cities with populations below 5,000 had extremely high percent increases that skewed the classification and do not compare well against the larger cities, so they were removed.
The second thematic method is a choropleth map forecasting population growth estimates for the year 2040. A dataset from the North Central Texas Council of Governments, the 2040 Demographic Forecast, was used to display the population forecast. It uses long-range forecasting methods to give a view of where and how much growth will occur in the metro area. The forecasting applies a series of methods and formulas to calculate projections and accounts for many sources of error. The data displayed in the choropleth is broken up into districts, with a percent value showing low to high growth. A choropleth map gives a good view of where the population will grow, and it also allows the graduated symbols to be placed over the layer so the reader can see both where and how much growth is occurring.
The 2040 Demographic Forecast data was standardized using population statistics from 2005, which were already present in the dataset. A percent increase was calculated from the 2005 population statistics to the projected 2040 population statistics, again by dividing the increase by the original 2005 value and multiplying by 100. This dataset came in shapefile format, so the attribute (database) file was exported to Excel to calculate the percent increase, saved as an .xls file, and brought back into ArcMap, where it was joined to the original 2040 Demographic Forecast layer. As before, a percent increase shows population growth better than raw population totals, so it is best to standardize this data.
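The same standardization could be sketched directly against the shapefile's attribute table, avoiding the Excel round trip described above; the field names here are placeholders, not the dataset's actual columns.

    import geopandas as gpd

    forecast = gpd.read_file("demographic_forecast_2040.shp")

    # Percent increase from the 2005 statistics to the projected 2040 values.
    forecast["PctIncrease"] = (
        (forecast["Pop2040"] - forecast["Pop2005"]) / forecast["Pop2005"] * 100
    )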
Natural Breaks classification with 5 class breaks was used to display the percent increase of the 2040 Demographic Forecast data. Most of the data clumps at the beginning of the number line, with major outliers falling at the middle and end. These outliers are the most important breaks because they are the fastest-growing districts. Natural Breaks classification finds the minor breaks clumped at the beginning of the number line, creating two breaks below the mean value, and places three breaks above the mean, which separates out the highest and most important values.
Other data classification methods were tested for each population layer. One such method is Quantile classification. With five classes it comes close to Natural Breaks, with a similar breakdown of classes, but it clumps four breaks too early at the beginning of the number line. Those four breaks fall below the mean value, leaving a single class for all of the data above the mean. This classification does not show the data well because the highest values collapse into one class.
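A short sketch comparing the two schemes on the same values with mapclassify; the pattern described above (four Quantile breaks below the mean versus Natural Breaks reserving several classes for the high values) would show up in the printed class bounds. The series name is the placeholder from the earlier sketch.

    import mapclassify

    values = forecast["PctIncrease"].to_numpy()   # placeholder percent-increase series

    natural = mapclassify.NaturalBreaks(values, k=5)
    quantile = mapclassify.Quantiles(values, k=5)

    # Upper class bounds for each scheme.
    print("Natural Breaks:", natural.bins)
    print("Quantiles:     ", quantile.bins)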