Sunday 2 September 2012

AIX Interview Questions

There is a certain amount of overlap between these questions. The interviewer should not ask them all, but ask a sufficient variety of questions to ascertain the applicant’s knowledge of both AIX in general and the interviewer’s operating environment.

Basic Hardware, AIX, and other Applications

Question: What levels of AIX have you worked with?
Answer:
 AIX 4.3.3 and AIX 5.1 good
 If only AIX 4.2.1, 4.1, or 3.2, bad

Question: What types of machines have you worked with?
Answer: Ask follow-up questions about the hardware to ascertain the applicant’s level of competence and understanding.
 Take note of applicants who mention older technologies like “SP2” or newer technologies like “Regatta”

Question: What applications have you used with AIX?
Answer: Look for applications like HACMP, ADSM/TSM, SAP, i2, Manugistics, MQ Series, WebSphere, etc.
 An applicant with considerable HACMP experience is likely to have quite a bit of AIX knowledge, especially LVM, disks, general TCP/IP, and application enablement.


User Management

Question: Suppose that there are some users that need to run certain commands normally only accessible by root. How would you grant them access?
Answer: sudo

Question: Discuss your philosophy on granting user access to root or to an application user ID.
Answer: For root, prefer to 

Base Kernel

Question: The system is performing very slowly; you have discovered that this is due to an increasing number of defunct processes being created. How can you determine the cause?
Answer:
1. Check to see if new defunct processes are being created, owned by the init process, with no processes being deleted at all.
2. If some defunct processes are being cleared out and the overall number of defunct processes is still growing, then this is due to an application creating more defunct processes than the init process can clean out. 
3. However, if NO defunct processes owned by init (PID 1) are being cleared out, this points to an incorrect wait entry in the /etc/inittab file. The init process is stuck on this wait entry and will not cycle through the defunct processes until that entry finishes. The solution is to change that “wait” entry to a “once” entry.
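One way to watch the trend described above is to count defunct entries per parent PID from saved "ps -ef" output taken a few minutes apart. The following is a minimal sketch, not a finished tool: the sample text is hypothetical, and it assumes the usual "UID PID PPID ..." column order with "<defunct>" in the command column.

```python
from collections import Counter


def count_defunct_by_parent(ps_output):
    """Count <defunct> entries per parent PID in `ps -ef`-style text."""
    counts = Counter()
    for line in ps_output.splitlines()[1:]:  # skip the header line
        if "<defunct>" not in line:
            continue
        fields = line.split()
        # Assumed column order: UID PID PPID ...
        counts[int(fields[2])] += 1
    return counts


# Hypothetical sample, for illustration only.
SAMPLE = """\
     UID   PID  PPID   C    STIME    TTY  TIME CMD
    root     1     0   0   Sep 01      -  0:05 /etc/init
  appusr  4102  3990   0                  0:00 <defunct>
  appusr  4104  3990   0                  0:00 <defunct>
    root  4200     1   0                  0:00 <defunct>
"""

if __name__ == "__main__":
    for ppid, n in count_defunct_by_parent(SAMPLE).items():
        print(f"parent {ppid}: {n} defunct")
```

A count under PID 1 that never shrinks between snapshots matches case 3; a count that grows under an application's PID points at that application.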

Question: You get a message that the system can’t fork any additional processes. What do you suspect to be the problem?
Answer: Either paging space is full or maxuproc (maximum number of user processes) is set too low.

Question: You’re running some commands and getting an error that says “srcmstr daemon is not running”. Obviously, it is, because the machine appears to be functional. What is the likely cause? 
Answer: Someone has updated some of the base kernel filesets without rebooting the machine.

Question: How do you run 64-bit applications on AIX 4.3.3? On AIX 5.1? What else might you have to consider on an AIX 5 machine?
Answer: 
• AIX 4.3.3 only comes with a 32-bit kernel, so you need to ensure that 64-bit application support is enabled (via the load64bit entry in /etc/inittab or done via SMIT).
• For AIX 5.1, either use the same method as AIX 4.3.3 OR configure the system to use a true 64-bit kernel. If you use a true 64-bit kernel, 32-bit applications will still run, but 32-bit kernel extensions and device drivers will NOT load.

ODM

Question: Describe one situation where it might be appropriate to edit the ODM.
Answer: E.g., you need to make parameter changes to a device that is open and that cannot be modified without closing the device (e.g., a network device), and the machine can be easily rebooted following the change.

Question: Describe how to manually edit the ODM. What commands should be used?
Answer: 
1. Use the odmget command to grab the appropriate entries from the ODM, into a text file. 
2. Edit these entries. 
3. Use the odmdelete command to delete the current entries.
4. Use the odmadd command to add the new entries.
5. If these are device configuration files, use the savebase command to save them into the boot image.
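The five steps above can be sketched as a dry-run plan that only builds the command lines and executes nothing. The object class, search criteria, and file path used here are hypothetical placeholders, and the choice of vi as the editor is an assumption; review each command before running it on a real system.

```python
import shlex


def odm_edit_plan(odm_class, criteria, stanza_file, is_device_class=True):
    """Return the command lines for a manual ODM edit, without running them."""
    q = shlex.quote(criteria)
    f = shlex.quote(stanza_file)
    plan = [
        f"odmget -q {q} {odm_class} > {f}",   # 1. dump the entries to a text file
        f"vi {f}",                            # 2. edit the entries (editor is an assumption)
        f"odmdelete -o {odm_class} -q {q}",   # 3. delete the current entries
        f"odmadd {f}",                        # 4. add the edited entries
    ]
    if is_device_class:
        plan.append("savebase")               # 5. save device config to the boot image
    return plan


if __name__ == "__main__":
    # Hypothetical class, criteria, and path, for illustration only.
    for cmd in odm_edit_plan("CuAt", "name=ent0", "/tmp/ent0.stanza"):
        print(cmd)
```

Keeping an untouched copy of the odmget output before editing gives a way to restore the original entries with odmadd if the change goes wrong.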

Question: When might the savebase command be used?
Answer: After making manual changes to the CuAt, CuDv, CuDvDr, CuDep, or CuVPD ODM classes (aka the device configuration database), these classes must be saved into the boot image in the boot logical volume. If they are not saved, there is a risk that the changes will be lost at boot. Note that the chdev command (and others) will cause savebase to be run.

Booting

Question: What key would you use at boot time to signal the machine to boot into service mode?
Answer: F5 or 5 will signal the machine to boot from the internal boot list. F1 or 1 will bring the machine into SMS mode. The F[x] keys are used for graphical consoles; the others are used on ASCII consoles.

Question: What is the effect of setting the normal mode bootlist to cd0 first and hdisk0 second?
Answer: The machine will boot from the CDROM into maintenance mode automatically if a boot CD has been inserted. If not, the machine will boot from the disk into normal mode.

Question: A machine hangs at boot because the network is inaccessible or an NIS server is inaccessible. How can you recover?
Answer: With an inaccessible network (LED 581 for a long time), the machine will generally recover after timing out. With an NIS problem, you will probably have to boot into service mode.

Disk Device Configuration

Question: You are adding a number of new internal SCSI disks to a pSeries 680 and a new drawer of SSA drives. How does cfgmgr operate differently when configuring each?
Answer: Internal SCSI drives are discovered one at a time by order of their SCSI ID (i.e., their slot location). SSA drives are discovered by their SSA serial number, which has nothing to do with their slot in the drawer.

Question: After a reboot or after running cfgmgr, a particular disk is listed twice. Give some possible reasons why and how you might narrow them down.
Answer:
1. The disk can be detected, but it has failed, and since cfgmgr can’t verify the physical volume identifier, it marks the old disk Defined and then creates a new disk entry with no PVID.
2. You are running in a multi-path (either SAN or twintailed SCSI) environment and the system is (properly) seeing the disk multiple times.
• If one disk is Defined and the other Available, or lspv shows the second instance as having no PVID, then it’s probably #1. Also check the error report.
• If the disk is seen as being Available twice (and lspv lists it twice), it’s probably #2.

Question: What considerations must you give when creating RAID-0 or RAID-1 disks in an SSA array?
Answer: You cannot cable RAID-0 or RAID-1 arrays such that there are multiple adapters in the loop (as opposed to RAID-5).

Question: What’s the difference between RAID-0 and RAID-1?
Answer: RAID-0 is hardware striping. RAID-1 is mirroring.

Question: When using an SSA array, what are the tradeoffs to consider between using a large RAID-5 array and using LVM mirroring?
Answer: RAID-5 would waste fewer disks, but it would be slower.

LVM/JFS

Question: How do you properly mirror rootvg to protect the OS from crashing should a disk fail?
Answer: 
1. Use the mirrorvg command to mirror the logical volumes
2. Run the bosboot command to rebuild the boot image and update the disk boot record on both disks that contain a copy of the boot logical volume.
3. Run the bootlist command to add both disks to the boot list.
4. Mirror all non-rootvg paging spaces as well as those in rootvg.
Dump device?
Paging space? 

Question: Why is it important to mirror non-rootvg paging spaces, and what might be the effect if this is not done?
Answer: The OS allocates pages in a round-robin fashion across all paging devices. If a non-rootvg disk containing a paging space fails, the system will likely crash unless it is mirrored. Putting paging spaces on RAID disks is not recommended due to the performance implications.

Question: What kind of considerations can be made with regard to mirroring rootvg in a non-mission critical environment where maximizing disk space is important?
Answer: You might be able to get away with not mirroring any paging spaces, the boot logical volume, or any non-critical file systems. If a disk crashes, you might have to perform some maintenance in service mode.


Question: Provide the steps to move a file system from one volume group to another WITHOUT recreating and restoring it.
Answer:
1. Unmount the file system
2. Use the cplv command to copy the logical volume to the new volume group.
3. Use the chfs command to update the dev and log entries for the file system.
4. Run fsck.

Question: Why is directly editing /etc/filesystems, to change information for a file system, a bad idea?
Answer: The entries from /etc/filesystems are usually stored in the logical volume control block for each logical volume. If you ever had to run importvg for this volume group (e.g., after a reinstall or if using HACMP), the changes would be lost.

Question: You can’t unmount a file system (AIX says it’s in use). How do you determine what process is keeping it open?
Answer: fuser or lsof.

Question: The “df” command shows that the file system is 100% full, but “du” shows no files in the file system. Why, and how do you debug this?
Answer: A running process is holding open a file descriptor that references either a large amount of space not saved as a file or a file that has been deleted. Use “lsof” or “fuser” to track down this process.
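The deleted-but-still-open case can be reproduced on any POSIX system. This is a small illustrative sketch (the function name is made up for the example): it writes a temporary file, removes its name, and shows that the open descriptor still references the data.

```python
import os
import tempfile


def open_deleted_file_demo(size=4096):
    """Delete a file while it is open; return (name_exists, link_count, size)."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"x" * size)
        os.unlink(path)      # the name is gone, so ls and du no longer see it
        st = os.fstat(fd)    # the open descriptor still references the data
        return os.path.exists(path), st.st_nlink, st.st_size
    finally:
        os.close(fd)         # the space is released only when the last descriptor closes


if __name__ == "__main__":
    print(open_deleted_file_demo())
```

This is why the space comes back only after the offending process closes the file or exits.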

Question: The “df” and “du” commands indicate that the file system is 50% full, but you notice that the total size of all files, as given by “ls” is greater than the size of the file system. How is this possible?
Answer: These are sparse files, where the size of the file is larger than the amount of space it takes up. The file contains a large number of “null blocks”.
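A sparse file is easy to create for demonstration. The sketch below extends an empty temporary file without writing any data, then compares the apparent size (what “ls” reports) with the allocated space (what “du” counts). It assumes st_blocks is in 512-byte units, which is common but not guaranteed, and whether the file is actually stored sparsely depends on the file system.

```python
import os
import tempfile


def sparse_file_sizes(apparent_size=1024 * 1024):
    """Create a file with no written data; return (apparent_bytes, allocated_bytes)."""
    fd, path = tempfile.mkstemp()
    try:
        os.ftruncate(fd, apparent_size)   # extend the file without writing blocks
        st = os.fstat(fd)
        allocated = st.st_blocks * 512    # assumption: 512-byte block units
        return st.st_size, allocated
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    apparent, allocated = sparse_file_sizes()
    print(f"apparent={apparent} allocated={allocated}")
```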

Question: What steps should you follow to replace a failed hdisk?
Answer:
1. Remove all allocated logical volumes (if not mirrored) or LV copies (if mirrored).
2. Remove disk from the volume group.
3. Remove disk definition (and pdisk, if it’s an SSA drive mapped 1-to-1)
4. Physically replace disk
5. Run cfgmgr.
6. Add disk to volume group and recreate LVs or LV copies.

Network

Question: Basic network configuration
Answer: "smit mktcpip"?

Question: What file is used to tell AIX to use local /etc/hosts entries instead of DNS?
Answer: /etc/netsvc.conf. Specifically, the entry is “hosts=local,bind”.
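To check a machine's configured lookup order quickly, the hosts line can be pulled out of the file contents. This is a minimal sketch that assumes the simple “hosts=...” form quoted above with “#” comments; the function name is made up for the example.

```python
def host_lookup_order(netsvc_text):
    """Return the resolver order from netsvc.conf-style text, e.g. ['local', 'bind']."""
    for raw in netsvc_text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key.strip() == "hosts":
            return [s.strip() for s in value.split(",") if s.strip()]
    return []                                 # no hosts entry found


if __name__ == "__main__":
    print(host_lookup_order("# use /etc/hosts first\nhosts=local,bind\n"))
```

An empty result means the file has no hosts entry, so the system default order applies.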

Question: There is a conflict between a machine’s IP address and the DNS entry. Unfortunately, the DNS server is a Win2K server managed by a little old lady who works once a week. How do you get around this?
Answer: Put an entry in /etc/hosts, and then use /etc/netsvc.conf to force AIX to read /etc/hosts first.

Question: You’ve properly exported a file system on a server. You’ve properly set up the NFS file system definition on the client, and the proper daemons are running. However, when you try to mount it, you get an error:
mount: 1831-009 aixnim not in hosts database
mount: 1831-008 giving up on:
aixnim:/nim
A route to the remote host is not available.

What two options do you have to resolve this?
Answer: 
1. Check /etc/resolv.conf for an entry to the appropriate nameserver.
2. Add it to /etc/hosts.

Question: What is resolv.conf used for?
Answer: To provide a list of nameservers and domains to search for hosts.

Question: Your machines are in the xyz.com domain. You want to be able to look up hosts in the abc.com domain. What should you do so that the ping command will find a route to those hosts?
Answer: 
1. Add the nameserver for that domain to /etc/resolv.conf.
2. Make sure you have a route to those nameservers.

Question: Where is the default gateway stored in AIX? Where are other routes stored?
Answer: As ODM entries in the CuAt ODM class.

SMIT

Question: How would you use SMIT to create scripts for future use?
Answer: Use the smit.script file, which records the commands that have been executed. Or, use the -x flag with smit so that the commands are logged without being run.

Question: You’re using SMIT and are getting a strange error during its execution. How do you debug what SMIT is doing?
Answer: Use the -D flag on SMIT, and look at the smit.log file for the AIX commands being executed by SMIT and the error messages being generated.

Performance

Question: A project manager claims that a system is CPU-bound. What commands do you run to verify this, and what sort of output are you looking for? 
Answer: 
1. Run vmstat and look at the wait, idle, CPU utilization, runq and blockq parameters. 
2. A CPU-bound system will exhibit relatively low wait and idle percentages and a high degree of CPU usage. 
3. In addition, the runq parameter will, on average, meet or exceed the number of CPUs on the system, and the blockq parameter will, on average, be low, because few threads are blocked waiting on I/O.
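The check above can be written as a rough rule of thumb over a few vmstat samples. In this sketch the thresholds (90% busy, 10% wait) are illustrative assumptions, not IBM guidance, and the samples are hypothetical dictionaries keyed by the vmstat column names r, us, sy, id, and wa.

```python
def looks_cpu_bound(samples, ncpus, busy_pct=90, max_wait_pct=10):
    """Heuristic: high us+sy, low wa, and a run queue at or above the CPU count."""
    if not samples:
        raise ValueError("need at least one vmstat sample")

    def avg(key):
        return sum(s[key] for s in samples) / len(samples)

    busy = avg("us") + avg("sy")
    return busy >= busy_pct and avg("wa") <= max_wait_pct and avg("r") >= ncpus


if __name__ == "__main__":
    # Hypothetical samples from a 4-CPU machine, for illustration only.
    busy_box = [
        {"r": 6, "us": 80, "sy": 15, "id": 3, "wa": 2},
        {"r": 5, "us": 85, "sy": 12, "id": 2, "wa": 1},
    ]
    print(looks_cpu_bound(busy_box, ncpus=4))
```

Averaging several intervals matters because the first vmstat line reports values since boot rather than current activity.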

Question: You have a 32-way Regatta. Why is vmstat not an accurate tool to measure CPU utilization for each processor?
Answer: vmstat takes a system average. The sar command can show the CPU utilization for a particular processor.

Question: If you have Workload Manager policies defined for several classes of processes, and the machine is under little load, what happens?
Answer: Nothing. WLM policies are not enforced until there is contention for resources.

Installation Issues

Question: What is the difference between a migration install, a preservation install, or an overwrite install?
Answer: 
• An overwrite install recreates rootvg and the file systems and installs everything from scratch. Other non-OS file systems are deleted.
• A preservation install preserves rootvg. It only overwrites the base OS (to the base level from the installation image) and thus only affects /, /usr, /var, and /tmp. Your own modified config files are generally saved to a special location and can be recovered.
• A migration install is used to take an existing AIX machine to a new AIX level (e.g., 4.3 to 5.1).

Question: You need to update an AIX 4.3.2 machine to AIX 4.3.3. How do you do this?
Answer: Take the AIX 4.3.3 CDs and perform an “update_all”.

Question: Define under what circumstances two different versions of a product can coexist on the same machine in AIX.
Answer:
1. The names of the filesets for that product are different when stored in the ODM.
2. The product does not store names in the ODM.

Question: What considerations do you need to make when cloning system X to system Y?
Answer: Make sure that all the drivers needed for system Y and its associated hardware are included in the image for system X.

Question: A user wants you to install some set of commands on system X; these commands already exist on system Y. How would you tell which filesets to install?
Answer: Run “lslpp -w” with the full path of the command as its argument to see what fileset provides that command.

Question: You’ve installed a new fileset. How do you tell which files are provided?
Answer: Run “lslpp -f” with the fileset name as its argument to see what files are installed by that fileset.

AIX Patches and Maintenance

Question: Discuss your strategy for applying and committing maintenance
Answer:
• A prudent administrator will generally apply fixes first if those fixes have never been used before in the environment.
• After an evaluation period, the fixes can be committed.
• Subsequent installations of those fixes on other machines can then be committed.
• If these fixes are related to the base OS, a mksysb should be created first.

Question: What are the AIX Maintenance Levels and how do they differ from normal fixes?
Answer: AIX Maintenance Levels are similar to an NT service pack. They are generally considered safe to install “all at once”. Normal AIX fix collections may or may not contain fixes that could interfere with each other; however, maintenance levels are generally considered safe.

Question: Describe the difference between a PTF, an APAR, and a maintenance level.
Answer: An APAR is a specific patch that may update one or more filesets. A PTF is an IBM term for a collection of APARs commonly shipped together as a common fileset update. A maintenance level is a collection of APARs (also generally ordered as an APAR).

Question: Your machine is running AIX 4.3.3 with maintenance level 10 applied. “oslevel” (no option) reports “4.3.2.0”. Why?
Answer: There are some filesets installed that are at a level BELOW what is defined for 4.3.3.
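The comparison behind this answer is a level-by-level version check. The sketch below is a simplification with hypothetical fileset data: it flags every installed fileset whose level sorts below a target level. On a real system, “oslevel -l” with a level argument is the tool that lists the filesets holding the reported level back.

```python
def parse_level(level):
    """Turn a level string such as '4.3.3.0' into a comparable tuple of ints."""
    return tuple(int(part) for part in level.split("."))


def filesets_below(target, installed):
    """Return the names of filesets whose level is lower than the target level."""
    t = parse_level(target)
    return sorted(name for name, lvl in installed.items() if parse_level(lvl) < t)


if __name__ == "__main__":
    # Hypothetical inventory, for illustration only.
    installed = {
        "bos.rte": "4.3.3.75",
        "bos.net.tcp.client": "4.3.3.80",
        "bos.adt.debug": "4.3.2.0",
    }
    print(filesets_below("4.3.3.0", installed))
```

Comparing tuples of integers avoids the string-comparison trap where "4.3.3.10" would sort before "4.3.3.9".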

Question: What is the difference between applying and committing an APAR?
Answer: Applying it saves the old versions of the files so that you can back off the new version. Committing the APAR removes the old versions.

Question: How do you back off a committed set of patches?
Answer: You have to forcibly install the base level of the affected filesets, while NOT reinstalling any prerequisites, and then re-patch back to the appropriate level.



Question: You’re installing a new IBM 6228 fiber card into an existing 4.3.3 system for the first time. What problems might you encounter?
Answer: The AIX 4.3.3 CDs don’t provide any base support for IBM 6228 fiber cards. You must download the drivers in the form of an APAR and install them separately. AIX 5.1 provides fiber support in the base CDs.

Misc. / Other Products

Question: What TERM setting would you use on an IBM 3153 monitor?
Answer: ibm3151

Question: List some common sites to download freeware tools for AIX?
Answer: aixpdslib.seas.ucla.edu
purdue.edu
www.bullfreeware.com
IBM repository of Linux freeware for AIX

Question: A group of developers wants to know if they will be able to install different versions of the Java runtime on a machine. Specifically, they want to install the base level filesets for versions 1.2.2, 1.3.0, and 1.3.1, as well as a separate set of patches. What do you tell them?
Answer: JDK/JRE 1.2.2, 1.3.0, and 1.3.1 can all coexist on a machine, as they are completely separate sets of filesets. However, patches for these versions cannot coexist with the base levels.

Question: Versions 3.6.4 and 3.6.6 of the IBM CSet++ compiler are comprised of the same basic set of fileset names, meaning that they normally cannot be installed on the same machine at the same time. What solution (unsupported by IBM) do you suggest? What considerations must you give to this environment?
Answer: Install the second set of software on a second machine. Take all the files installed by those filesets and copy them to the other machine into another directory. The ODM will not recognize that the second set of filesets are installed, but the product will still be functional. Note that this is unsupported by IBM, but there’s a fine line between “unsupported” and “functional”. Note that you can’t very easily apply patches to this environment.

Question: What does the LIBPATH variable do?
Answer: It controls the order in which libraries will be discovered so as to arbitrate between libraries with the same name.
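The ordering rule can be modeled as a first-match search over a colon-separated list of directories. This is only a sketch of the precedence idea; the real loader also consults paths recorded in the executable and default library directories, which this example ignores. The library name in the usage example is hypothetical.

```python
import os
import tempfile


def find_library(name, libpath):
    """Return the first path in a colon-separated libpath that contains name."""
    for directory in libpath.split(":"):
        if not directory:
            continue                      # skip empty entries such as "a::b"
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate              # earlier directories win
    return None


if __name__ == "__main__":
    # Two temporary directories that both hold a library with the same name.
    with tempfile.TemporaryDirectory() as first, tempfile.TemporaryDirectory() as second:
        for d in (first, second):
            open(os.path.join(d, "libdemo.a"), "w").close()
        print(find_library("libdemo.a", first + ":" + second))
```

Swapping the two directories in the path changes which copy is found, which is the arbitration the answer describes.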

Question: When IBM says some particular sequence of commands or environment is “not supported”, what do they mean?
Answer: That it may work, but that they won’t SUPPORT it because they haven’t TESTED it.
