Everyday DBA for Progress OpenEdge

Report
Everyday DBA for Progress OpenEdge
What You Need to Know, and When to Call for Help
David Eddy
Linx Tailored Development Services
©2010 Linx ADS Pty Ltd
What is the DBA responsible for?
Users must be able to connect to,
read from, and update the database
whenever they need to.
©2010 Linx ADS Pty Ltd
2
Sounds simple... What does it mean?
• The database must exist and be up to date and
consistent
• It must be on-line when users need to connect
• Users must be able to connect
• DB operations should be fast enough, and other
applications should not be unduly impacted
©2010 Linx ADS Pty Ltd
3
It may also involve...
• Ensuring user actions are monitored and/or
controlled
• Protecting the database from unauthorised access
• Complying with legislative requirements or other
standards
©2010 Linx ADS Pty Ltd
4
Agenda
•
•
•
•
•
•
Existence
Uptime
Connectivity
Security
Auditing
Performance
©2010 Linx ADS Pty Ltd
5
Hang on! That’s quite a bit!
• Some tasks need to be done weekly, daily, or
sometimes even hourly
– You should know how to do these, and when to do them
• Some tasks only need to be done occasionally
– For most users, it’s easy to forget how to do these
– Call in the experts
• Don’t panic
– Progress databases largely require no administration once
set up (for small to medium users).
©2010 Linx ADS Pty Ltd
6
Existence
The database must exist, be current, and be usable.
©2010 Linx ADS Pty Ltd
7
Existence
Where is your database today?
• Your database must exist
– If it goes away, you must be able to get it back
• Your database must be current
– Lost data costs money
• Your application must exist and be current
• What challenges are there and how will you
address them?
©2010 Linx ADS Pty Ltd
8
Existence
The tender art of backing up
• Backup – everybody does it
– On-line, offline, mirror split
• Restore – can you do it?
– Have you actually tested your backup recently?
– Clear, available documentation
• What if you’re sick and the restore has to be done by someone
else?
• What about the application?
– Or the system data? User password database?
©2010 Linx ADS Pty Ltd
9
Backup techniques
• For small and medium sites:
– KISS: Use probkup (online or offline)
probkup [online] dbname backupfilename –com
– Then back up backup file, applications, and critical system
files to removable media (and take them offsite)
• Large sites may wish to use mirror split (i.e. arraybased backup)
– proquiet dbname –C enable (pauses update activity)
– Mirror split or snapshot commence
– proquiet dbname –C disable
©2010 Linx ADS Pty Ltd
10
Restoring a backup
• Technique will correspond with backup
• For probkup-based backups:
– Restore DB backup file (and possibly other stuff) from
removable media into scratch area
– Restore database:
prorest dbname backupfile
– Copy any other needed parts (e.g. application code) to
proper locations
©2010 Linx ADS Pty Ltd
11
Why application?
• Application corresponds with database
• Changes to DB structure, or application code
changes, may mean restored DB will not run
with current application code
– Not usually an issue in practise
– But needed anyway in case of total disk loss
©2010 Linx ADS Pty Ltd
12
Existence II: Preventing data loss
The case for after-imaging
• If you have to restore from backup, you’ve lost
everything that’s happened since
– Paper-based operations? No problem (just $ to
rekey)
– Telephone or web-based? You’re in trouble!
• Solution: After-image logging
– Trivial to implement
– Self-managing in OpenEdge 10 and higher
– No excuse not to!
©2010 Linx ADS Pty Ltd
13
Implementing after-imaging (AI)
It’s easy!
• Add after-image extents to DB if you need to
– 5 variable-sized extents ok for most small-medium
sites
• Create a directory to hold archived afterimage data
• Enable after-imaging
©2010 Linx ADS Pty Ltd
14
Implementing AI
Adding AI extents
• Create a file e.g. “ai.st” to
describe AI extents (one line
per extent needed)
• Add to database
# AI extents for somedb
a /path/to/directory
a /path/to/directory
...
a /path/to/directory
prostrct {add|addonline} dbname ai.st -validate
prostrct {add|addonline} dbname ai.st
©2010 Linx ADS Pty Ltd
15
Implementing AI
Enabling after-imaging in the database
# Shut down DB
# Enable after-imaging and AI log management
rfutil dbname –C mark backedup
rfutil dbname –C aimage begin
rfutil dbname –C aiarchiver enable
# Modify db startup to include argument
#
“-aiarcdir /path/to/aiarchive/dir”
# Restart DB
©2010 Linx ADS Pty Ltd
16
Existence III: Up in smoke
Planning for business continuance
• What planning do you need to make for
disasters?
– What if there’s a fire at the office?
– Or total loss of electrical power?
• If you’re a multi-site operation, you should
consider “DR”
– Would being able to run on an off-site system
benefit the business?
©2010 Linx ADS Pty Ltd
17
Replication
A complex topic
• Key part of Business Continuance planning
– But only a part
•
•
•
•
•
OpenEdge Replication
Trigger-based replication
Array-based replication
Log shipping
Often an area needing expert advice
©2010 Linx ADS Pty Ltd
18
Information vacuum sucks
• Plans are of no use if nobody knows of them
– You might be sick or injured (or in the Bahamas)
• Clear, written checklists and instructions
– Store a copy with your off-site backups
– Include exact commands to use in case must be
performed by unfamiliar personnel
©2010 Linx ADS Pty Ltd
19
DBA tasks for: Existence
• Take backups (and manage them properly)
• Test restore
– Test fact that you can read your backups
– Test documented procedure in case you’re not there
– Document how long restore takes
• Document disaster scenarios and recovery
procedures
• Socialise existence of documentation
©2010 Linx ADS Pty Ltd
20
Uptime
The system needs to be available whenever the users need it.
But it need not be available when they don’t need it.
UPTIME
©2010 Linx ADS Pty Ltd
21
Uptime requirements
Identify your needs
• Most businesses have modest needs, e.g. 5x16 or 6x24
– e.g Mon-Fri, 7am-11pm or Mon 6am -> Sat 9pm
• Very few business are genuinely 24x7
– Exceptions are typically web-response systems
– Large businesses
– 24-hour (petrol station, convenience store, Crown Casino)
• Nobody will care if you bust a gut making the system
available when they don’t need it
• Uptime costs money – how much do you need?
©2010 Linx ADS Pty Ltd
22
Other uptime requirements
• Do you need to be able to remotely administer
your database?
– Use free OpenEdge Explorer if 10.1C+
• Should your database auto-restart if it goes
down?
– Use OpenEdge Management or bespoke scripting if
needed/appropriate
• What non-database services are needed?
– AppServers, WebSpeed, batch procedures, interfaces
e.g. Sonic
©2010 Linx ADS Pty Ltd
23
AdminServer
• Consider using AdminServer for database
administration
– Use is required anyway for WebSpeed, AppServer, etc so
you may as well use it.
• Database configuration stored in
$DLC/properties/conmgr.properties
• Administered using “dbman” commands
– Database have nicknames instead of pathnames
• Preferred method if using OpenEdge Explorer or
Management (i.e. Web-based admin interface)
– Default for OpenEdge 10.2B+
©2010 Linx ADS Pty Ltd
24
What causes uptime problems
•
•
•
•
•
•
•
•
Human error
Lack of DB free space
Lack of log free space
Lack of filesystem free space
Reaching DB extent size limits
Reaching area size limits
Hard kill of client processes (sometimes)
Silly stuff:
– Removing .lk file
– Changing host name
– System crash
©2010 Linx ADS Pty Ltd
25
Free space management
• Space/size issues are your primary uptime risk
• Address by combination of:
– Risk mitigation
– Monitoring
©2010 Linx ADS Pty Ltd
26
Free space: Risk mitigation strategies
• Enable large file handling (for Enterprise Server)
• Ensure all storage areas have variable extent
• Test whether on-line extent add works for you
– Might not if clients don’t have DB extent file permissions
•
•
•
•
•
Establish a usage baseline
Make sure AI logs are archived/emptied
Make sure BI log doesn’t get too big
Monitor free space at all levels
Be aware of area maximum size given recs/block
setting (_AreaThreshold VST)
©2010 Linx ADS Pty Ltd
27
Free space: Monitoring
• Monitor database free space for each storage
area independently (including log areas)
– prostrct statistics
– _AreaStatus VST
• Check for variable extent size and growth
• Monitor filesystem free space
– In filesystems containing variable-size extents
©2010 Linx ADS Pty Ltd
28
Monitor usefully
• Just knowing kB free is often useless
– How many kB per week are consumed?
• Establish a usage baseline
• Describe free space in terms of number of weeks
– Long-term average consumption
– 7-day moving average consumption
• Also monitor if 7-day average moves too far from
long-term average
©2010 Linx ADS Pty Ltd
29
Automate
• Automated monitoring is most useful
– Especially when combined with exception
reporting
• Can be hard to do if you’re not systemsliterate
– Call in the experts to set it up for you
– Learn to interpret output
©2010 Linx ADS Pty Ltd
30
Bad things!
Hot-spots to watch out for
• Less than 2 weeks free space in any data area
– Remember: When one area fills, the whole DB is full regardless
of how much free space there is in other areas
• Variable extent >1GB
– Unless you have large file handling enabled
• Very little (<4GB) free space in filesystem
– If there are variable-sized extents there
• Rapidly-growing log files in filesystem with variable-sized
extents
• BI log growing rapidly and no large file handling
• BI log large (e.g. >4GB) can have extended crash recovery
time
• Only one empty AI log extent
©2010 Linx ADS Pty Ltd
31
More bad things
...a little more indirect, this time
• Long-running transactions
– Cause BI log growth
• No empty AI log extents
– Backups will fail until corrected
– Log space exhaustion prevents updates and shuts DB
• 2 or more locked AI log extents
– Replication has probably stopped
– Unreplicated data prevents AI log space reuse
– Measure with rfutil db –C aimage extent list
©2010 Linx ADS Pty Ltd
32
Large file handling for OpenEdge
• Supported in most V9 and all OE10 ports
– Enterprise Server only (not Workgroup or
Personal)
– Filesystems need LFH also (beware AIX!)
• How to tell whether it’s enabled:
– You have an extent larger than 2GB
– Use proutil db –C describe (10.1C+)
– Look in the log file (after “multi-user session
begin”) for “large-file handling enabled”
©2010 Linx ADS Pty Ltd
33
DBA tasks for: Uptime
• Monitor free space in:
– All DB areas
– Filesystems
•
•
•
•
Monitor size of BI log
Monitor number of empty AI log extents
Watch for variable-size extent growth
Know how to add extents
– On-line, ideally
©2010 Linx ADS Pty Ltd
34
Connectivity
The database should have enough “slots” for users to be able
to connect.
©2010 Linx ADS Pty Ltd
35
Connectivity
Is primarily concerned with...
•
•
•
•
Remote vs. Local connections
Maximum total DB connections
Server pools
Lock table size
©2010 Linx ADS Pty Ltd
36
Remote vs. Local connections
• You should use local connections where feasible
– Best performance
– Minimum resource usage
• Remote connections are useful for:
– Off-system clients
– Ensuring database stability if client termination causes
problems
– Segregating users into pools (access denial, user count
control, performance)
– Performance in specialised circumstances (e.g. highlyasymmetric NUMA)
©2010 Linx ADS Pty Ltd
37
Remote connections
What makes a connection remote
• Presence of –S parameter in causes remote
connection
– Beware, this parameter is needed on DB server to
allow remote connections
– Usage of common parameter lists for client &
server is therefore a Bad Idea
©2010 Linx ADS Pty Ltd
38
Monitoring connectivity
What to monitor
• Number of connections vs. Maximum
– Global for database
– Per server pool
• Number of locks vs. Maximum
©2010 Linx ADS Pty Ltd
39
Monitoring connectivity
How to monitor
• Promon
– R&D, 1, 13: Overall connections, overall lock
usage
– R&D, 1, 2: Users per server, number of free slots
per server, number of free server slots
• VSTS: _Connect, _Server, _DbStatus
©2010 Linx ADS Pty Ltd
40
Server pools
• By default, 1 server pool only
• Multiple pools recommended if you
frequently use SQL-92
• Multiple pool setup can be hard to ‘get’
– Straightforward but frequently misunderstood and
only occasionally needed
– An example where calling the experts can save
you some headaches
©2010 Linx ADS Pty Ltd
41
DBA tasks for: Connectivity
• Monitor usage of constrained resources
– Global connections, server connections, locks
• Make sure there is sufficient headroom to allow
for burst usage
– Connections: At least 10% free
– Locks: Limit ideally at least 1.3x max usage
• You may have connection limit policy to enforce
• Connections are not necessarily licences
©2010 Linx ADS Pty Ltd
42
Monitoring Tools
Some people prefer to buy their tools
• OpenEdge Management
(http://web.progress.com/en/openedge/openedge-management.html)
– Monitoring, trending, alerting, job/report execution,
remote monitoring, files, cpu, disk, network
• ProTop (http://www.greenfieldtech.com/articles/protop.shtml)
– Free, monitoring only
• Monitoring infrastructure kits
– E.g. Nagios, Hyperic
©2010 Linx ADS Pty Ltd
43
Security
Many things to many people
©2010 Linx ADS Pty Ltd
44
Security means many things
•
•
•
•
•
Authentication (identification)
Authorization (limitation of action)
Encryption (prevention of snooping)
Integrity (detection/prevention of attacks)
Key part of larger concerns
– Auditing
– Legislative requirements
©2010 Linx ADS Pty Ltd
45
• Know how legislation or Best Practise affects
you
– Aus: TFN management
– USA: HIPPA, SoX, P.I. Data (Mass.)
– Europe: Personal identification data
• Can be part of a competitive position
– Web commerce
©2010 Linx ADS Pty Ltd
46
Basic tasks
• Filesystem permissions for DB containers
– Disallow deletions
• Use _User table for key user identification e.g.
DBA
• Take advantage of pre-compiled permissions
to limit interactive accessibility of DB
©2010 Linx ADS Pty Ltd
47
More advanced stuff
• Use CLIENT-PRINCIPAL and use it to assert trusted
identity in AppServers
• Single sign-on
• Encryption (field & table level)
• SSL plumbing
• System hardening
• Many options, often with substantial tradeoffs
– Get expert advice
©2010 Linx ADS Pty Ltd
48
DBA tasks for security
• Mostly about setup, procedure, and
sometimes programming
– Little maintenance needed
• Difficult to maintain skill
©2010 Linx ADS Pty Ltd
49
Auditing
Who did what and when... inarguably
©2010 Linx ADS Pty Ltd
50
Auditing
Auditing is...
• Recording of changes in DB
– Often triggered by legal requirements
– Sometimes pure paranoia
• Maintenance of queryable archive
– Investigate prior questionable activity
– Spot checking by auditors
• Non-repudiable
– Might have to stand up to challenge
– Also useful for preventing “black hats” hiding tracks
©2010 Linx ADS Pty Ltd
51
Complications
• Tradeoff between completeness and
performance
• Fully-enriched audit trail requires code
modifications
• Planning is required to guide recording of
events
– Need to think about how archive might be queried
or reported upon
©2010 Linx ADS Pty Ltd
52
DBA tasks for auditing
• Keep audit data from building up in live DB
– Move to audit archive DB regularly
•
•
•
•
Maintain auditing roles
Planning of events and context
Implementing changes to audit policy
Specialised help may be needed in some areas
–
–
–
–
SSO integration
Key techniques e.g. copying identity to AppServers
Help with planning events and context data
Setup of archive database
©2010 Linx ADS Pty Ltd
53
Performance
How fast to you want to go?
©2010 Linx ADS Pty Ltd
54
Performance is multi-faceted
An incomplete list...
•
•
•
•
•
•
•
•
•
•
•
•
Database tuning
System tuning and configuration
Schema design
Query formation and index usage
Performance impact of DB structural elements
Usage patterns
Code efficiency
Client tuning and parameters
Connection mode
Hardware and network configuration
Business changes
User expectations
©2010 Linx ADS Pty Ltd
55
A complex system
A complex system is a system composed of
interconnected parts that as a whole exhibit one
or more properties not obvious from the
properties of the individual parts.
• Snowballing decline
• Databases can exhibit other counterintuitive
behaviours
©2010 Linx ADS Pty Ltd
56
DBA tasks in performance
• Most aspects of performance need expert
advice
• Some tasks can be usefully managed by DBA:
– Gradually increasing execution times and/or
sustained record reads >50,000/sec: Sign of bad
indexing (code fault)
– High I/O = too little buffer memory (DB config)
• If adding memory doesn’t fix it, call the experts – it can
be a complex topic!
©2010 Linx ADS Pty Ltd
57
Summary of Tasks
There’s quite a bit you can and should do
• Hourly
– Long-running transactions and BI size/free space
– Connections, lock table usage
• Daily
– Storage area and filesystem free space (expressed as weeks
worth of consumption)
– Growth of variable-sized extents
– Backup
– AI extent archive management
• As needed
– Perform test of backup restore process
– Numerous configuration and setup tasks
– Document your DR plans and promulgate
©2010 Linx ADS Pty Ltd
58
Call the experts to...
•
•
•
•
•
•
•
Investigate performance and capacity issues
Take baseline load data for long-term trend analysis
Review your system for best practise
Help plan implementation of new features
Help solve automation problems
Execute seldom-performed tasks
Train or mentor you
©2010 Linx ADS Pty Ltd
59
Most importantly...
• Know your business!
– Understand business requirements of systems
– Plan to meet objectives
– Implement appropriate measures to suit
– Know what you need, and what you don’t need
©2010 Linx ADS Pty Ltd
60
A Final Thought
• This presentation largely canvassed topics
rather than detail
• If you want detailed instruction, talk to me
about training and mentoring
• If there’s demand, might develop a 1 or 2 day
course covering the mechanics of these topics
©2010 Linx ADS Pty Ltd
61
Questions and Answers
©2010 Linx ADS Pty Ltd
62
Thanks for your time …
©2010 Linx ADS Pty Ltd
63

similar documents