I have a DataFrame of shape (699394, 3). Below is a small sample, df.
df = pd.DataFrame({'name': ['Bullet', 'Gauge', 'MFG Brand Name', 'Material', 'Number of Pieces', 'Product Depth (in.)', 'Product Height (in.)', 'Product Weight (lb.)', 'Product Width (in.)', 'Application Method', 'Assembled Depth (in.)', 'Assembled Height (in.)', 'Assembled Width (in.)', 'Bullet', 'Cleanup', 'Color Family', 'Color/Finish', 'Concrete Use', 'Container Size', 'Coverage Area (sq. ft.)', 'Deck Use', 'Interior/Exterior', 'MFG Brand Name', 'Mildew Resistant', 'Opacity', 'Paint Product Type', 'Patching & Repair Product Type', 'Product Style', 'RGB Value', 'Sealer', 'Time before recoating (hours)', 'Tintable', 'Transparency', 'UV Resistant', 'Waterproof', 'Bath Faucet Type', 'Built-in Water Filter', 'Bullet', 'Certifications and Listings', 'Color Family', 'Color/Finish', 'Connection size (in.)', 'Faucet Features', 'Faucet Included Components', 'Faucet type', 'Flow rate (gallons per minute)', 'Handle type', 'MFG Brand Name', 'Number of Faucet Handles', 'Number of Spray Settings', 'Number of showerheads', 'Product Depth (in.)', 'Product Height (in.)', 'Product Width (in.)', 'Showerhead face diameter (in.)', 'Showerhead type', 'Spray Pattern', 'Appliance Type', 'Assembled Depth (in.)', 'Assembled Height (in.)', 'Assembled Width (in.)', 'Bullet', 'Capacity of Microwave (cu. 
ft.)', 'Certifications and Listings', 'Color/Finish', 'Color/Finish Family', 'Cut-Out Front to Back Width (in.)', 'Cut-Out Height (in.)', 'Cut-Out Left to Right Length (in.)', 'Door Swing/Style', 'Exhaust Fan Speeds', 'Exhaust Maximum CFM', 'MFG Brand Name', 'Microwave Door Release', 'Microwave Features', 'Microwave Size', 'Number of One-Touch Settings', 'Number of Power Levels', 'Oven Settings', 'Product Depth (in.)', 'Product Height (in.)', 'Product Weight (lb.)', 'Product Width (in.)', 'Safety Listing', 'Sensor Cook', 'Turntable', 'Turntable Diameter', 'Vent Type', 'Wattage (watts)', 'Battery Power Type', 'Battery Size', 'Bulb Type Included', 'Bullet', 'Certifications and Listings', 'Commercial Light Type', 'Connection Type', 'ENERGY STAR Certified', 'Emergency run time (min.)', 'Fixture Color/Finish', 'Fixture Color/Finish Family'], 'product_uid': [100001.0, 100001.0, 100001.0, 100001.0, 100001.0, 100001.0, 100001.0, 100001.0, 100001.0, 100002.0, 100002.0, 100002.0, 100002.0, 100002.0, 100002.0, 100002.0, 100002.0, 100002.0, 100002.0, 100002.0, 100002.0, 100002.0, 100002.0, 100002.0, 100002.0, 100002.0, 100002.0, 100002.0, 100002.0, 100002.0, 100002.0, 100002.0, 100002.0, 100002.0, 100002.0, 100005.0, 100005.0, 100005.0, 100005.0, 100005.0, 100005.0, 100005.0, 100005.0, 100005.0, 100005.0, 100005.0, 100005.0, 100005.0, 100005.0, 100005.0, 100005.0, 100005.0, 100005.0, 100005.0, 100005.0, 100005.0, 100005.0, 100006.0, 100006.0, 100006.0, 100006.0, 100006.0, 100006.0, 100006.0, 100006.0, 100006.0, 100006.0, 100006.0, 100006.0, 100006.0, 100006.0, 100006.0, 100006.0, 100006.0, 100006.0, 100006.0, 100006.0, 100006.0, 100006.0, 100006.0, 100006.0, 100006.0, 100006.0, 100006.0, 100006.0, 100006.0, 100006.0, 100006.0, 100006.0, 100007.0, 100007.0, 100007.0, 100007.0, 100007.0, 100007.0, 100007.0, 100007.0, 100007.0, 100007.0, 100007.0], 'value': ['Versatile connector for various 90° connections and home repair projects. 
Stronger than angled nailing or screw fastening alone. Help ensure joints are consistently straight and strong. Dimensions: 3 in. x 3 in. x 1-1/2 in.. Made from 12-Gauge steel. Galvanized for extra corrosion resistance. Install with 10d common nails or #9 x 1-1/2 in. Strong-Drive SD screws', '12', 'Simpson Strong-Tie', 'Galvanized Steel', '1', '1.5', '3', '0.26', '3', 'Brush,Roller,Spray', '6.63 in', '7.76 in', '6.63 in', 'Revives wood and composite decks, railings, porches and boat docks, also great for concrete pool decks, patios and sidewalks. 100% acrylic solid color coating. Resists cracking and peeling and conceals splinters and cracks up to 1/4 in.. Provides a durable, mildew resistant finish. Covers up to 75 sq. ft. in 2 coats per gallon. Creates a textured, slip-resistant finish. For best results, prepare with the appropriate BEHR product for your wood or concrete surface. Actual paint colors may vary from on-screen and printer representations. Colors available to be tinted in most stores. Online Price includes Paint Care fee in the following states: CA, CO, CT, ME, MN, OR, RI, VT', 'Soap and Water', 'Browns / Tans', 'Tugboat', 'Yes', '1 GA-Gallon', '75', 'Yes', 'Exterior', 'BEHR Premium Textured DeckOver', 'Yes', 'Solid', 'Exterior Paint/Stain', 'Restoration Coating', 'Cottage', '119:100:086', 'No', '6', 'No', 'Solid', 'Yes', 'No', 'Combo Tub and Shower', 'No', 'Includes the trim kit only, the rough-in kit (R10000-UNBX) is sold separately. Includes the handle. Maintains a balanced pressure of hot and cold water even when a valve is turned on or off elsewhere in the system. 
Due to WaterSense regulations in the state of New York, please confirm your shipping zip code is not restricted from use of items that do not meet WaterSense qualifications', 'ADA Compliant,CSA Certified,IAPMO Certified', 'Chrome', 'Chrome', '1/2 In.', 'No Additional Features', 'Handles,Pressure Balance/Scald Guard', 'Bath Faucet', '2.5', 'Lever', 'Delta', 'Single Handle', '1', '1', '15.28', '24', '7.09', '4.06', 'Fixed Mount', 'Rain', 'Over the Range Microwave', '18.5 in', '17.13 in', '29.94 in', "Spacious 1.9 cu. ft. capacity accommodates dinner plates and casserole dishes with ease. 1100 watts of cooking power and 10 cooking levels make cooking and reheating a snap. 400 CFM venting system whisks smoke, steam and odors away from the cooktop to keep your kitchen air clear. Single piece door with built-in touch-activated control console streamlines the exterior for a sleek, modern look and easy cleanup. Cook with confidence with the Sensor and Programmed cooking cycles and options. Sensor cycles include: Steam/Simmer, AccuPop and Potato for fast prep of family favorites. Kids' Menu: it's simple, it's fast. The Kids' Menu is preset with cooking times and power levels for a variety of favorites like pizza and chicken nuggets. Now after school snacks don't have to be an afternoon hassle. TimeSavor Plus True Convection cooking uses a 1600-watt element and a fan to circulate heat over, under and around food for fast cooking and even browning. Industry leading CleanRelease non-stick interior requires no special cleaners. A damp cloth or sponge is all thatâ\x80\x99s needed to remove cooked-on spills and splashes. Recessed turntable's on/off feature is especially helpful when cooking with plates that are larger than the turntable. Automatic interior incandescent light and large window help you track cooking progress. 4-speed fan with Auto Vent Fan function. 
To keep the microwave oven from overheating, the vent fan will automatically turn on at high speed if the temperature from the range or cooktop below the microwave oven gets too hot. Replaceable charcoal and dishwasher safe mesh filters takes grease and other impurities out of the air. 90° hinge. With this innovative hinge design you can install this model next to a wall and still open the door easily. Limited 1-year warranty. Convertible venting. Can be installed as vented or non-vented (recirculating) to fit a variety of installation needs. AccuPop cycle senses the perfect pop every time. It adapts cooking time using a sound sensor that measures the time between pops so you don't have to worry about bag size or excessive unpopped kernels. Now you can finally watch the movie, not the microwave. Included items: convection rack, SureMist steamer and cooking rack. Included cooking rack lets you microwave on two levels, so you can cook several items at once", '1.9', '1-UL Listed', 'Stainless Steel', 'Stainless', '12', '17.13', '30', 'Right to Left Swing', '4', '400', 'Whirlpool', 'Pull', 'Charcoal Filter,Clock,Convection,Cooktop Lighting,Interior Light,Microwave Rack,Nightlight,One Touch Cooking,Removable Filter,Steam Cook,Timer,Turntable,Turntable On/Off Option', '30 in.', '6', '10', 'Defrost,Keep Warm,Sensor Cook', '18.5', '17.13', '67.1', '29.94', 'UL', 'Yes', 'Yes', '14', 'Convertible', '1100', 'Ni-Cad', '.Built-In', 'LED', 'Advanced LED technology is dependable and energy efficient. 2 adjustable heads allow you to direct light where it is needed. Engineering-grade thermoplastic housing is impact-resistant, scratch-resistant and corrosion-proof. Integrated LEDs means no bulbs are required. Typical life of the LEDs is 10 years of maintenance-free operation. Black housing has a compact low-profile design. Sealed, maintenance-free Ni-cad battery delivers 90 minute capacity to the LEDs. Dual voltage input capability (120 to 277-volt). 
Easily installs to wall or ceiling. UL damp-location listed', '1-UL Listed,OSHA Compliant', 'Exit and Emergency', 'Hardwired', 'No', '90', 'Black', 'Black']}, columns=['product_uid', 'name', 'value'])
I set the index and then unstack the name column:
df.set_index(['product_uid', 'name']).unstack('name')
This produces the following (which is exactly what I want):
value \
name Appliance Type Application Method
product_uid
100001 NaN NaN
100002 NaN Brush,Roller,Spray
100005 NaN NaN
100006 Over the Range Microwave NaN
100007 NaN NaN
\
name Assembled Depth (in.) Assembled Height (in.)
product_uid
100001 NaN NaN
100002 6.63 in 7.76 in
100005 NaN NaN
100006 18.5 in 17.13 in
100007 NaN NaN
\
name Assembled Width (in.) Bath Faucet Type Battery Power Type
product_uid
100001 NaN NaN NaN
100002 6.63 in NaN NaN
100005 NaN Combo Tub and Shower NaN
100006 29.94 in NaN NaN
100007 NaN NaN Ni-Cad
\
name Battery Size Built-in Water Filter Bulb Type Included
product_uid
100001 NaN NaN NaN
100002 NaN NaN NaN
100005 NaN No NaN
100006 NaN NaN NaN
100007 .Built-In NaN LED
... \
name ... Spray Pattern Time before recoating (hours)
product_uid ...
100001 ... NaN NaN
100002 ... NaN 6
100005 ... Rain NaN
100006 ... NaN NaN
100007 ... NaN NaN
\
name Tintable Transparency Turntable Turntable Diameter UV Resistant
product_uid
100001 NaN NaN NaN NaN NaN
100002 No Solid NaN NaN Yes
100005 NaN NaN NaN NaN NaN
100006 NaN NaN Yes 14 NaN
100007 NaN NaN NaN NaN NaN
name Vent Type Waterproof Wattage (watts)
product_uid
100001 NaN NaN NaN
100002 NaN No NaN
100005 NaN NaN NaN
100006 Convertible NaN 1100
100007 NaN NaN NaN
However, when I try this on the actual dataset, I get a MemoryError:
Traceback (most recent call last):
File "<pyshell#172>", line 1, in <module>
attr_df.set_index(['product_uid', 'name']).unstack('name')
File "C:\Python34\lib\site-packages\pandas\core\frame.py", line 3801, in unstack
return unstack(self, level)
File "C:\Python34\lib\site-packages\pandas\core\reshape.py", line 404, in unstack
return _unstack_frame(obj, level)
File "C:\Python34\lib\site-packages\pandas\core\reshape.py", line 445, in _unstack_frame
return unstacker.get_result()
File "C:\Python34\lib\site-packages\pandas\core\reshape.py", line 147, in get_result
values, value_mask = self.get_new_values()
File "C:\Python34\lib\site-packages\pandas\core\reshape.py", line 184, in get_new_values
new_values = np.empty(result_shape, dtype=dtype)
MemoryError
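The last frame of the traceback, np.empty(result_shape, dtype=dtype), is where pandas allocates a single dense array with one cell for every (product_uid, name) combination, whether or not a value exists for that pair. A rough estimate of that allocation might look like the sketch below. The counts are made-up placeholders, not figures from the real dataset, and 8 bytes per cell assumes object pointers on a 64-bit Python; dense_unstack_bytes is a hypothetical helper name.

```python
def dense_unstack_bytes(n_index_values, n_column_values, bytes_per_cell=8):
    """Approximate size of the dense array that unstack has to allocate.

    bytes_per_cell=8 is an assumption: one pointer per cell for object
    (string) data on a 64-bit interpreter. The strings themselves are extra.
    """
    return n_index_values * n_column_values * bytes_per_cell


# Hypothetical counts, for illustration only. On the real data they would be
# df['product_uid'].nunique() and df['name'].nunique().
n_products = 80000
n_names = 5000

size_gb = dense_unstack_bytes(n_products, n_names) / 1e9
print("about %.1f GB for the pointer array alone" % size_gb)
```

With counts of that order, the result is mostly NaN but still has to be allocated in full, which would be consistent with the MemoryError even though the long-format frame has only 699394 rows.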
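My question is: is there a way to unstack in chunks? Would unstacking in chunks even work? Is there some other way to get around this memory error? I tried pivot_table, but my data is not numeric, so it does not aggregate.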
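To make the chunking idea concrete, here is a minimal sketch of what unstacking in chunks might look like. It assumes that each wide chunk can be processed or written to disk on its own; concatenating all the chunks back together would rebuild the same dense result and likely hit the same memory limit. The function name unstack_in_chunks, the chunk size, and the toy data are hypothetical. The sketch uses pivot_table with aggfunc="first", which accepts non-numeric values and keeps the first value when a (product_uid, name) pair is repeated.

```python
import pandas as pd


def unstack_in_chunks(long_df, chunk_size):
    """Yield wide frames, each covering at most chunk_size product_uid values."""
    uids = long_df["product_uid"].unique()
    for start in range(0, len(uids), chunk_size):
        batch = uids[start:start + chunk_size]
        part = long_df[long_df["product_uid"].isin(batch)]
        # aggfunc="first" works on strings; it does not try to average them.
        yield part.pivot_table(index="product_uid", columns="name",
                               values="value", aggfunc="first")


# Tiny made-up frame with the same three columns as the question's df.
toy = pd.DataFrame({
    "product_uid": [1.0, 1.0, 2.0, 2.0, 3.0],
    "name": ["Material", "Gauge", "Material", "Color", "Gauge"],
    "value": ["Steel", "12", "Wood", "Brown", "10"],
})

chunks = list(unstack_in_chunks(toy, chunk_size=2))
for wide in chunks:
    print(wide)
    print()
```

Each chunk only contains columns for the names that occur in that batch of products, so the chunks generally have different column sets. Whether this helps depends on what is done with the wide table afterwards.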