A tidy R interface to the USPS post calc and zone calc APIs

License: MIT (LICENSE.md); a second LICENSE file of undetected type is also present.
aedobbyn/postal

postal 📫

Project Status: Active - The project has reached a stable, usable state and is being actively developed.

Want an estimate of the price of sending a package somewhere via the US Postal Service? Need to get the USPS shipping zone between two zip codes?

Well, this is a 📦 for your 📦s. postal provides a tidy interface to the USPS domestic zone calc and post calc APIs.




Installation

From CRAN:

install.packages("postal")

The development version:

# install.packages("devtools")
devtools::install_github("aedobbyn/postal")

Postage Price Calculator

The single postage calculation function, fetch_mail, works for flat-rate envelopes and boxes (the kind you pick up at the post office and wrestle with until they fold into a box shape) as well as for packages, which vary by their weight and dimensions.

Currently only destinations in the US are supported.

Usage

Specify a 5-digit origin zip and destination zip, along with the date and time you’re going to be shipping ("today" and "now" are allowed). Other specifics are optional.

library(postal)

USPS offers many colorful options to handle all your shipping needs, which are included in the arguments to fetch_mail. So to answer the burning question… what if we wanted to ship live animals from New Mexico to Philly by ground on July 2 at 2:30pm in a nonrectangular package?

fluffy <- fetch_mail(origin_zip = "88201",
                     destination_zip = "19109",
                     shipping_date = "2018-07-02",
                     shipping_time = "14:30",
                     live_animals = TRUE,
                     ground_transportation_needed = TRUE,
                     pounds = 42,
                     ounces = 3,
                     length = 12,
                     width = 10,
                     height = 7,
                     girth = 5,
                     shape = "nonrectangular",
                     verbose = FALSE)

When will it get there and how much will it cost?

fluffy %>%
  dplyr::pull(delivery_day)
#> [1] "Mon, Jul 9"

fluffy %>%
  dplyr::pull(retail_price)
#> [1] "$83.61"

Finally, the important questions have been answered.


General case

For a more usual case, we’ll send a 15lb package from Portland, Maine to Portland, Oregon. The response shows all shipping options along with their prices, dimensions, and delivery dates.

(mail <- fetch_mail(origin_zip = "04101",
                    destination_zip = "97211",
                    shipping_date = "today",
                    shipping_time = "now",
                    pounds = 15,
                    type = "package",
                    shape = "rectangular",
                    show_details = TRUE)) %>%
  dplyr::slice(1:3)
#> Using ship on date 2018-07-30.
#> Using ship on time 18:18.
#> # A tibble: 3 x 10
#> origin_zip dest_zip title delivery_day retail_price click_n_ship_pr…
#> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 04101 97211 Priorit… Tue, Jul 31… $114.50 $114.50
#> 2 04101 97211 Priorit… Tue, Jul 31… $114.50 $114.50
#> 3 04101 97211 Priorit… Tue, Jul 31… $119.50 $119.50
#> #... with 4 more variables: dimensions <chr>, delivery_option <chr>,
#> # shipping_date <chr>, shipping_time <chr>

mail %>%
  dplyr::slice(1:3) %>%
  knitr::kable()

| origin_zip | dest_zip | title | delivery_day | retail_price | click_n_ship_price | dimensions | delivery_option | shipping_date | shipping_time |
|---|---|---|---|---|---|---|---|---|---|
| 04101 | 97211 | Priority Mail Express 1-Day™ | Tue, Jul 31 by 3:00 PM | $114.50 | $114.50 | | Normal Delivery Time | 2018-07-30 | 18:18 |
| 04101 | 97211 | Priority Mail Express 1-Day™ | Tue, Jul 31 by 10:30 AM | $114.50 | $114.50 | | Hold For Pickup | 2018-07-30 | 18:18 |
| 04101 | 97211 | Priority Mail Express 1-Day™ | Tue, Jul 31 by 10:30 AM | $119.50 | $119.50 | | 10:30 AM Delivery | 2018-07-30 | 18:18 |

The web interface should display the same results:

[image: post_calc]

fetch_mail is a good option if you want to display data in the way USPS does. If you want to compute on prices and dates, you can tidy the dataframe by sending it into scrub_mail.

scrub_mail replaces "Not available"s and empty strings with NAs, changes prices to numeric, splits delivery day into a date and time of day (we infer the year from the current year and use the 24hr clock), and computes the delivery duration in days.

mail %>%
  scrub_mail() %>%
  dplyr::slice(1:3) %>%
  dplyr::select(
    delivery_date, delivery_by_time,
    delivery_duration, retail_price,
    click_n_ship_price, dplyr::everything()
  )
#> # A tibble: 3 x 12
#> delivery_date delivery_by_time delivery_duration retail_price
#> <date> <chr> <time> <dbl>
#> 1 2018-07-31 15:00 1 114.
#> 2 2018-07-31 10:30 1 114.
#> 3 2018-07-31 10:30 1 120.
#> #... with 8 more variables: click_n_ship_price <dbl>, origin_zip <chr>,
#> # dest_zip <chr>, title <chr>, dimensions <chr>, delivery_option <chr>,
#> # shipping_date <chr>, shipping_time <chr>
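To make the cleaning steps described above concrete, here is a minimal base R sketch of the same kind of transformation. The helpers `parse_price` and `parse_delivery_day` are hypothetical names invented for illustration; they are not part of postal and are not how scrub_mail is implemented. The sketch assumes delivery strings shaped like the ones in the output above and takes the year as an explicit argument.

```r
# Hypothetical helpers that mimic the steps described for scrub_mail.
# They are illustrations only, not the package's implementation.

# "$114.50" -> 114.5; "Not available" and "" -> NA
parse_price <- function(x) {
  x[x %in% c("Not available", "")] <- NA_character_
  as.numeric(gsub("[$,]", "", x))
}

# "Tue, Jul 31 by 3:00 PM" -> date 2018-07-31 and 24hr time "15:00"
parse_delivery_day <- function(x, year) {
  # month.abb holds English month abbreviations, so this does not depend on locale
  month <- match(sub("^[A-Za-z]+, ([A-Za-z]+) .*$", "\\1", x), month.abb)
  day <- as.integer(sub("^[A-Za-z]+, [A-Za-z]+ ([0-9]+).*$", "\\1", x))
  date <- as.Date(sprintf("%d-%02d-%02d", year, month, day), format = "%Y-%m-%d")

  # The "by <time>" part is optional, e.g. "Thu, Aug 2" has none
  has_time <- grepl(" by ", x, fixed = TRUE)
  time_str <- ifelse(has_time, sub("^.* by ", "", x), NA_character_)
  hour <- as.integer(sub(":.*$", "", time_str))
  minute <- as.integer(sub("^[0-9]+:([0-9]+).*$", "\\1", time_str))
  is_pm <- grepl("PM$", time_str)
  hour24 <- hour %% 12L + ifelse(is_pm, 12L, 0L)
  by_time <- ifelse(has_time, sprintf("%02d:%02d", hour24, minute), NA_character_)

  data.frame(delivery_date = date, delivery_by_time = by_time,
             stringsAsFactors = FALSE)
}

days <- parse_delivery_day(
  c("Tue, Jul 31 by 3:00 PM", "Tue, Jul 31 by 10:30 AM"),
  year = 2018
)
prices <- parse_price(c("$114.50", "Not available"))

# Delivery duration in days, relative to the shipping date
duration <- as.numeric(days$delivery_date - as.Date("2018-07-30"))
```

Passing the year explicitly keeps the sketch reproducible; the README states that scrub_mail infers it from the current year instead.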

Multiple inputs and error handling

These functions work on a single origin and single destination, but multiple can be mapped into a tidy dataframe. Important parts of the request (origin_zip, destination_zip, shipping_date, and shipping_time) are included in the result, making it easier to distinguish different inputs from one another.

By default we try the API 3 times before giving up. You can modify that by changing n_tries. If after n_tries we still have an error (here, "foo" and "bar" are not good zips), a "no_success" row is returned so that we don’t error out on the first failure.

origins <- c("11238", "foo", "60647", "80222")
destinations <- c("98109", "94707", "bar", "04123")

purrr::map2_dfr(
  origins, destinations,
  fetch_mail,
  type = "box",
  n_tries = 3,
  verbose = FALSE
)
#> Warning in .f(.x[[i]], .y[[i]], ...): Zip codes supplied must be 5 digits.
#> Error on request. Beginning try 2 of 3.
#> Error on request. Beginning try 3 of 3.
#> Unsuccessful grabbing data for the supplied arguments.
#> Warning in .f(.x[[i]], .y[[i]], ...): Zip codes supplied must be 5 digits.
#> Error on request. Beginning try 2 of 3.
#> Error on request. Beginning try 3 of 3.
#> Unsuccessful grabbing data for the supplied arguments.
#> # A tibble: 14 x 9
#> origin_zip dest_zip title delivery_day retail_price click_n_ship_pr…
#> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 11238 98109 Priorit… Thu, Aug 2 $18.90 $18.90
#> 2 11238 98109 Priorit… Thu, Aug 2 Not availab… $18.90
#> 3 11238 98109 Priorit… Thu, Aug 2 $13.65 $13.65
#> 4 11238 98109 Priorit… Thu, Aug 2 Not availab… $13.65
#> 5 11238 98109 Priorit… Thu, Aug 2 $7.20 $7.20
#> 6 11238 98109 Priorit… Thu, Aug 2 Not availab… $7.20
#> 7 foo 94707 no_succ… no_success no_success no_success
#> 8 60647 bar no_succ… no_success no_success no_success
#> 9 80222 04123 Priorit… Thu, Aug 2 $18.90 $18.90
#> 10 80222 04123 Priorit… Thu, Aug 2 Not availab… $18.90
#> 11 80222 04123 Priorit… Thu, Aug 2 $13.65 $13.65
#> 12 80222 04123 Priorit… Thu, Aug 2 Not availab… $13.65
#> 13 80222 04123 Priorit… Thu, Aug 2 $7.20 $7.20
#> 14 80222 04123 Priorit… Thu, Aug 2 Not availab… $7.20
#> #... with 3 more variables: dimensions <chr>, shipping_date <chr>,
#> # shipping_time <chr>

Similarly, if a response is received but no mail services are found, a dataframe with missing values is returned.

fetch_mail(origin_zip = "04101",
           destination_zip = "97211",
           shipping_date = "3018-07-04", # way in the future!
           type = "package",
           show_details = TRUE)
#> Using ship on time 18:19.
#> No Mail Services were found for this request. Try modifying the argument inputs.
#> # A tibble: 1 x 10
#> origin_zip dest_zip title delivery_day retail_price click_n_ship_price
#> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 04101 97211 <NA> <NA> <NA> <NA>
#> #... with 4 more variables: dimensions <chr>, delivery_option <chr>,
#> # shipping_date <chr>, shipping_time <chr>

This approach takes care of much of the try-catching you might otherwise have to implement, with the aim of making it easier to request a lot of data in one go.
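The retry-then-placeholder pattern described in this section can be sketched in base R roughly as follows. Everything here is an assumption made for illustration: `fetch_with_retries` and `fake_fetch` are hypothetical names, `fake_fetch` stands in for a real request, and none of this is the package's own code.

```r
# Hypothetical wrapper: try a request up to n_tries times, and return a
# "no_success" row instead of raising an error if every attempt fails.
fetch_with_retries <- function(fetch_fn, origin_zip, destination_zip, n_tries = 3) {
  for (attempt in seq_len(n_tries)) {
    result <- tryCatch(
      fetch_fn(origin_zip, destination_zip),
      error = function(e) NULL
    )
    if (!is.null(result)) return(result)
  }
  data.frame(origin_zip = origin_zip, dest_zip = destination_zip,
             retail_price = "no_success", stringsAsFactors = FALSE)
}

# Stand-in for a real request: errors unless both zips are 5 digits
fake_fetch <- function(origin_zip, destination_zip) {
  if (!all(grepl("^[0-9]{5}$", c(origin_zip, destination_zip)))) {
    stop("Zip codes supplied must be 5 digits.")
  }
  data.frame(origin_zip = origin_zip, dest_zip = destination_zip,
             retail_price = "$7.20", stringsAsFactors = FALSE)
}

origins <- c("11238", "foo")
destinations <- c("98109", "94707")

# One row per origin-destination pair; the bad pair does not stop the run
results <- do.call(
  rbind,
  Map(function(o, d) fetch_with_retries(fake_fetch, o, d), origins, destinations)
)
```

Because the failed pair still produces a row carrying its inputs, it stays easy to see which requests need a second look.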



Zones

Zones! A zone is a representation of distance between the origin and the destination zip codes. Zones are used in determining postage rates and delivery times.

Sometimes you just need to know the shipping zone between your origin and destination. Or maybe between all origins and all destinations for some app you’re building.

That doesn’t sound so bad, but there are 99999^2 or 9,999,800,001 possible 5-digit origin-destination zip combinations in the US. The USPS Zone Calc tool narrows down that space a bit by trimming zips to their first 3 digits. Every 5-digit zip’s information is defined by its 3-digit prefix, except for 5-digit exceptions, which are noted.
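The arithmetic and the prefix trimming can be checked with a few lines of base R. `zip_prefix` is a hypothetical helper name used only for this illustration.

```r
# Number of ordered origin-destination pairs over 99,999 five-digit values
n_five_digit_pairs <- 99999^2
format(n_five_digit_pairs, big.mark = ",", scientific = FALSE)

# Trimming 5-digit zips to their 3-digit prefixes
zip_prefix <- function(zip) substr(zip, 1, 3)
prefixes <- zip_prefix(c("04101", "04123", "97211"))

# Zips that share a prefix collapse into a single entry
unique_prefixes <- unique(prefixes)
```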

Usage

fetch_zones_three_digit lets you find the zone corresponding to a 3-digit origin zip prefix and one or many 3-digit destination zip prefixes.

fetch_zones_three_digit(origin_zip = "123",
                        destination_zip = "581")
#> # A tibble: 1 x 3
#> origin_zip dest_zip zone
#> <chr> <chr> <chr>
#> 1 123 581 6

If no destination is supplied, all valid destination zips and zones are returned for the origin.

fetch_zones_three_digit(origin_zip = "321")
#> # A tibble: 2,422 x 3
#> origin_zip dest_zip zone
#> <chr> <chr> <chr>
#> 1 321 005 5
#> 2 321 006 6
#> 3 321 007 6
#> 4 321 008 6
#> 5 321 009 6
#> 6 321 010 5
#> 7 321 011 5
#> 8 321 012 5
#> 9 321 013 6
#> 10 321 014 6
#> #... with 2,412 more rows

Multiple zips

You can provide a vector of zips and map them nicely into a long dataframe. Here we ask for all destination zips for these three origin zips.

If an origin zip is supplied that is not in use, it is messaged and included in the output with NAs in the other columns. For example, the origin "001" is not a valid 3-digit zip prefix.

origin_zips <- c("001", "271", "828")

origin_zips %>%
  purrr::map_dfr(fetch_zones_three_digit)
#> Origin zip 001 is not in use.
#> # A tibble: 4,845 x 3
#> origin_zip dest_zip zone
#> <chr> <chr> <chr>
#> 1 001 <NA> <NA>
#> 2 271 005 4
#> 3 271 006 7
#> 4 271 007 7
#> 5 271 008 7
#> 6 271 009 7
#> 7 271 010 4
#> 8 271 011 4
#> 9 271 012 4
#> 10 271 013 4
#> #... with 4,835 more rows

Similarly, map over both origin and destination zips and end up at a dataframe. verbose gives you a play-by-play if you want it. (More on auto-prepending leading 0s to input zips in the On Digits section below.)

dest_zips <- c("867", "53", "09")

purrr::map2_dfr(origin_zips, dest_zips,
                fetch_zones_three_digit,
                verbose = TRUE)
#> Grabbing origin ZIP 001
#> Origin zip 001 is not in use.
#> Making 53 into 053
#> Grabbing origin ZIP 271
#> Recieved 994 destination ZIPs for 8 zones.
#> Making 09 into 009
#> Grabbing origin ZIP 828
#> Recieved 994 destination ZIPs for 8 zones.
#> # A tibble: 3 x 3
#> origin_zip dest_zip zone
#> <chr> <chr> <chr>
#> 1 001 <NA> <NA>
#> 2 271 053 5
#> 3 828 009 8
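Note that the three inputs above produce three rows: origins and destinations are paired element by element, not crossed. If the goal is every origin against every destination, one possible approach (an assumption for illustration, not something the package provides) is to build the combinations first with base R's `expand.grid` and then map over the two resulting columns.

```r
origin_zips <- c("001", "271", "828")
dest_zips <- c("867", "053", "009")

# Element-wise pairing, as in the map2-style call above: 3 requests
pairwise <- data.frame(origin_zip = origin_zips, dest_zip = dest_zips,
                       stringsAsFactors = FALSE)

# Every origin with every destination: 3 x 3 = 9 requests
all_pairs <- expand.grid(origin_zip = origin_zips, dest_zip = dest_zips,
                         stringsAsFactors = FALSE)

nrow(pairwise)
nrow(all_pairs)
```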



Ranges and other features

The USPS zone calc web interface displays zones only as they pertain to destination zip code ranges:

[image: zone_calc]

If you prefer the range representation, you can set as_range = TRUE. Instead of a dest_zip column, you’ll get a marker of the beginning and end of the range in dest_zip_start and dest_zip_end.

fetch_zones_three_digit("42", "42",
                        as_range = TRUE)
#> # A tibble: 1 x 4
#> origin_zip dest_zip_start dest_zip_end zone
#> <chr> <chr> <chr> <chr>
#> 1 042 039 043 1
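To see how the two representations relate, a range row can be expanded back into individual 3-digit prefixes. This is a minimal sketch with a hypothetical helper, `expand_zip_range`; it assumes both endpoints are 3-digit prefixes and does not handle 5-digit exceptions.

```r
# Hypothetical helper: expand a start/end pair of 3-digit prefixes into
# every prefix in between, keeping the leading zeros
expand_zip_range <- function(start, end) {
  sprintf("%03d", seq(as.integer(start), as.integer(end)))
}

expanded <- expand_zip_range("039", "043")
```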

Details

You can optionally display other details about the zips, zones, and type of postage the zone designation applies to.

fetch_zones_three_digit(origin_zip = "404",
                        show_details = TRUE)
#> # A tibble: 2,422 x 6
#> origin_zip dest_zip zone specific_to_prior… same_ndc has_five_digit_e…
#> <chr> <chr> <chr> <lgl> <lgl> <lgl>
#> 1 404 005 4 FALSE FALSE FALSE
#> 2 404 006 7 FALSE FALSE FALSE
#> 3 404 007 7 FALSE FALSE FALSE
#> 4 404 008 7 FALSE FALSE FALSE
#> 5 404 009 7 FALSE FALSE FALSE
#> 6 404 010 5 FALSE FALSE FALSE
#> 7 404 011 5 FALSE FALSE FALSE
#> 8 404 012 5 FALSE FALSE FALSE
#> 9 404 013 5 FALSE FALSE FALSE
#> 10 404 014 5 FALSE FALSE FALSE
#> #... with 2,412 more rows

Definitions of these details can be found in zone_detail_definitions.

zone_detail_definitions %>%
  knitr::kable()

| name | digit_endpoint | definition |
|---|---|---|
| specific_to_priority_mail | 3, 5 | This zone designation applies to Priority Mail only. |
| same_ndc | 3, 5 | The origin and destination zips are in the same Network Distribution Center. |
| has_five_digit_exceptions | 3 | This 3 digit destination zip prefix appears at the beginning of certain 5 digit destination zips that correspond to a different zone. |
| local | 5 | Is this a local zone? |
| full_response | 5 | Prose API response for these two 5-digit zips. |

On Digits

The API endpoint used in fetch_zones_three_digit accepts exactly 3 digits for the origin zip; it mostly returns 3-digit destination zips, but also some 5-digit exceptions. For that reason,

  • If fewer than 3 digits are supplied, leading zeroes are added with a message
    • e.g. "8" becomes "008"
  • If more than 5 digits are supplied, the zip is truncated to the first 5 with a warning
    • If the zip is an origin, only the first 3 of those 5 digits are sent to the API
    • If the zip is a destination, the exact_destination flag determines whether we include results for that destination’s 3-digit prefix or filter to only the exact 5-digit destination

For example, when a 5-digit destination is supplied and exact_destination is FALSE, we include results for the destination 962 as well as for the exact one supplied, 96240.

fetch_zones_three_digit(origin_zip = "12358132134558",
                        destination_zip = "96240",
                        exact_destination = FALSE)
#> Warning in prep_zip(., verbose = verbose): Zip can be at most 5 characters;
#> trimming 12358132134558 to 12358.
#> # A tibble: 2 x 3
#> origin_zip dest_zip zone
#> <chr> <chr> <chr>
#> 1 123 962 8
#> 2 123 96240 5

When exact_destination is TRUE, we filter only to 96240, which is a 5-digit exception as its zone is different from its 3-digit prefix’s.

fetch_zones_three_digit(origin_zip = "12358132134558",
                        destination_zip = "96240",
                        exact_destination = TRUE)
#> Warning in prep_zip(., verbose = verbose): Zip can be at most 5 characters;
#> trimming 12358132134558 to 12358.
#> # A tibble: 1 x 3
#> origin_zip dest_zip zone
#> <chr> <chr> <chr>
#> 1 123 96240 5
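The digit rules in this section can be summarized in a short base R sketch. The names `prep_zip_sketch` and `filter_destination` are hypothetical and chosen for illustration; the package's own internals may well differ. The small `results` data frame reuses zones shown in the outputs above in place of a real API response.

```r
# Hypothetical sketch of the padding and trimming rules
prep_zip_sketch <- function(zip) {
  zip <- as.character(zip)
  if (nchar(zip) > 5) {
    trimmed <- substr(zip, 1, 5)
    warning("Zip can be at most 5 characters; trimming ", zip, " to ", trimmed, ".")
    zip <- trimmed
  }
  if (nchar(zip) < 3) {
    padded <- paste0(strrep("0", 3 - nchar(zip)), zip)
    message("Making ", zip, " into ", padded)
    zip <- padded
  }
  zip
}

# Hypothetical sketch of what exact_destination controls
filter_destination <- function(results, destination_zip, exact_destination = FALSE) {
  prefix <- substr(destination_zip, 1, 3)
  keep <- if (exact_destination) {
    results$dest_zip == destination_zip
  } else {
    results$dest_zip %in% c(prefix, destination_zip)
  }
  results[keep, , drop = FALSE]
}

# Only the first 3 digits of a (trimmed) origin are sent to the API
origin <- substr(suppressWarnings(prep_zip_sketch("12358132134558")), 1, 3)

# Stand-in for an API response, using zones from the examples above
results <- data.frame(
  origin_zip = origin,
  dest_zip = c("581", "962", "96240"),
  zone = c("6", "8", "5"),
  stringsAsFactors = FALSE
)

prefix_and_exact <- filter_destination(results, "96240", exact_destination = FALSE)
exact_only <- filter_destination(results, "96240", exact_destination = TRUE)
```

With exact_destination off, both the 962 row and the 96240 exception survive the filter; with it on, only the exception does.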

I just want to supply 5 digits

fetch_zones_three_digit should cover most 5-digit cases and supply the most information when show_details is TRUE. But if you just want to use the equivalent of the “Get Zone for ZIP Code Pair” tab, you can use fetch_zones_five_digit.

fetch_zones_five_digit("31415","92653")
#> # A tibble: 1 x 3
#> origin_zip dest_zip zone
#> <chr> <chr> <chr>
#> 1 31415 92653 8

Details given when show_details = TRUE in fetch_zones_five_digit are slightly different than they are for fetch_zones_three_digit (see Details).


All of the data

If you want the most up-to-date zip-zone mappings, fetch_all allows you to use the 3-digit endpoint to fetch all possible origins and, optionally, write them to a CSV as you go.

By default we use every possible origin from "000" to "999"; as of now "000" through "004" are all not in use along with a smattering of others like "404" and "867", but who knows, they might be used in the future.

fetch_all(all_possible_origins,
          sleep_time = 0.5, # How long to sleep in between requests, on average
          write_to = "path/to/my/file.csv")

If there’s a network error when grabbing a zip, we back off and try a few times and finally write "no_success" (rather than NAs, which indicate that the origin zip is not in use) in the destination zip and zone columns.

What that looks like in the event we switch on the internet between asking for origin "456" and origin "789":

#> # A tibble: 9 x 3
#> origin_zip dest_zip zone
#> <chr> <chr> <chr>
#> 1 123 no_success no_success
#> 2 456 no_success no_success
#> 3 789 005 7
#> 4 789 006 8
#> 5 789 007 8
#> 6 789 008 8
#> 7 789 009 8
#> 8 789 010 7
#> 9.........
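The "write as you go" idea can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: `fetch_all_sketch` and `fake_fetch` are hypothetical, the fake request fails for two origins to imitate a network outage, retries and back-off are left out, and the sleep is a fixed pause and not an average.

```r
# Hypothetical sketch: loop over origins, append each result to a CSV
# immediately, and record "no_success" when a request fails.
fetch_all_sketch <- function(origins, fetch_fn, write_to, sleep_time = 0) {
  for (origin in origins) {
    rows <- tryCatch(
      fetch_fn(origin),
      error = function(e) {
        data.frame(origin_zip = origin, dest_zip = "no_success",
                   zone = "no_success", stringsAsFactors = FALSE)
      }
    )
    # Write the header only once, then append
    first_write <- !file.exists(write_to)
    write.table(rows, write_to, sep = ",", row.names = FALSE,
                col.names = first_write, append = !first_write)
    Sys.sleep(sleep_time)
  }
  # Read everything back as character so leading zeros in zips survive
  read.csv(write_to, colClasses = "character")
}

# Stand-in for a real request: the "network" is down for the first two origins
fake_fetch <- function(origin) {
  if (origin %in% c("123", "456")) stop("network down")
  data.frame(origin_zip = origin, dest_zip = c("005", "006"),
             zone = c("7", "8"), stringsAsFactors = FALSE)
}

path <- tempfile(fileext = ".csv")
out <- fetch_all_sketch(c("123", "456", "789"), fake_fetch, write_to = path)
```

Writing after every origin means a crash partway through loses at most one request's worth of data.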

The entire set is also made available from a read-only MySQL database, which you can connect to with these creds:

host: knotsql.cimbccxns4ka.us-east-2.rds.amazonaws
port: 3306
database: master
user: public
password: password

Or some of it, for free

Free as in even less effort than the free as in beer stuff up there.

The zips_zones_sample dataset included in this package contains a random sample of 1,000,000 rows of all the 3-digit origin-destination pairs. Load it with:

data(zips_zones_sample)

It’s what you’d get by running fetch_all(show_details = TRUE), waiting a while, and then taking a sample.

zips_zones_sample
#> # A tibble: 1,000,000 x 6
#> origin_zip dest_zip zone specific_to_prior… same_ndc has_five_digit_e…
#> <chr> <chr> <int> <lgl> <lgl> <lgl>
#> 1 003 <NA> NA NA NA NA
#> 2 004 <NA> NA NA NA NA
#> 3 005 012 2 FALSE FALSE FALSE
#> 4 005 027 2 FALSE FALSE FALSE
#> 5 005 028 2 FALSE FALSE FALSE
#> 6 005 030 3 FALSE FALSE FALSE
#> 7 005 042 3 FALSE FALSE FALSE
#> 8 005 044 4 FALSE FALSE FALSE
#> 9 005 051 3 FALSE FALSE FALSE
#> 10 005 053 3 FALSE FALSE FALSE
#> #... with 999,990 more rows

The sample is about a quarter of the total number of rows between all origin prefixes and all destination prefixes, plus the 5 digit exceptions (~4m rows). See it put to use in the vignette.


That’s it! Bug reports and PRs welcome! 📬
